Preface


This book has one purpose: to help you understand vectors and tensors so that 
you can use them to solve problems. If you’re like most students, you first 
encountered vectors when you took a course dealing with mechanics in high 
school or college. At that level, you almost certainly learned that vectors are 
mathematical representations of quantities that have both magnitude and
direction, such as velocity and force. You may also have learned how to add vectors
graphically and by using their components in the x-, y- and z-directions. 

That’s a fine place to start, but it turns out that such treatments only scratch 
the surface of the power of vectors. You can harness that power and make it 
work for you if you’re willing to delve a bit deeper - to see vectors not just 
as objects with magnitude and direction, but rather as objects that behave in 
very predictable ways when viewed from different reference frames. That’s 
because vectors are a subset of a larger class of objects called “tensors,” which 
most students encounter much later in their academic careers, and which have 
been called “the facts of the Universe.” It is no exaggeration to say that our 
understanding of the fundamental structure of the universe was changed
forever when Albert Einstein succeeded in expressing his theory of gravity in
terms of tensors. 

I believe, and I hope you’ll agree, that tensors are far easier to understand 
if you first establish a stronger foundation in vectors, one that can help you 
cross the bridge between the “magnitude and direction” level and the “facts of 
the Universe” level. That’s why the first three chapters of this book deal with 
vectors, the fourth chapter discusses coordinate transformations, and the last 
two chapters discuss higher-order tensors and some of their applications. 

One reason you may find this book helpful is that if you spend a few hours 
looking through the published literature and on-line resources for vectors and 
tensors in physics and engineering, you’re likely to come across statements 
such as these: 
“A vector is a mathematical representation of a physical entity characterized
by magnitude and direction.” 

“A vector is an ordered sequence of values.” 

“A vector is a mathematical object that transforms between coordinate 
systems in certain ways.” 

“A vector is a tensor of rank one.” 

“A vector is an operator that turns a one-form into a scalar.” 

You should understand that every one of these definitions is correct, but 
whether it’s useful to you depends on the problem you’re trying to solve. 
And being able to see the relationship between statements like these should 
prove very helpful when you begin an in-depth study of subjects that use 
advanced vector and tensor concepts. Those subjects include Mechanics, 
Electromagnetism, General Relativity, and others. 

As with most projects, a good first step is to make sure you understand the 
terminology that will be used to attack the problem. For that reason, Chapter 1 
provides the basic definitions you’ll need to begin understanding vectors and 
tensors. And if you’re ready for more-advanced definitions, you can find those 
at the beginning of Chapter 5. 

You may be wondering how this book differs from other texts that deal with 
vectors and/or tensors. Perhaps the most important difference is that approximately
equal weight is given to vector and tensor concepts, with one entire
chapter (Chapter 3) devoted to selected vector applications and another chapter 
(Chapter 6) dedicated to example tensor applications. 

You’ll also find the presentation to be very different from that of other books. 
The explanations in this book are written in an informal style in which mathematical
rigor is maintained only insofar as it doesn’t obscure the underlying
physics. If you feel you already have a good understanding of vectors and 
may need only a quick review, you should be able to skim through Chapters 1 
through 3 very quickly. But if you’re a bit unclear on some aspects of vectors 
and how to apply them to problems, you may find these early chapters quite 
helpful. And if you’ve already seen tensors but are unsure of exactly what they 
are or how to apply them, then Chapters 4 through 6 may provide some insight. 

As a student’s guide, this book comes with two additional resources 
designed to help you understand and apply vectors and tensors: an interactive 
website and a series of audio podcasts. On the website, you’ll find the complete
solution to every problem presented in the text in interactive format -
that means you’ll be able to view the entire solution at once, or ask for a series
of helpful hints that will guide you to the final answer. So when you see a statement
in the text saying that you can learn more about something by looking
at the end-of-chapter problems, remember that the full solution to every one
of those problems is available to you. And if you’re the kind of learner who
benefits from hearing spoken words rather than just reading text, the audio 
podcasts are for you. These MP3 files walk you through each chapter of the 
book, pointing out important details and providing further explanations of key 
concepts. 

Is this book right for you? It is if you’re a science or engineering student 
and have encountered vectors or tensors in one of your classes, but you’re 
not confident in your ability to apply them. In that case, you should read the 
book, listen to the accompanying podcasts, and work through the examples 
and problems before taking additional classes or a standardized exam in which 
vectors or tensors may appear. Or perhaps you’re a graduate student struggling 
to make the transition from undergraduate courses and textbooks to the more-advanced
material you’re seeing in graduate school - this book may help you
make that step. 

And if you’re neither an undergraduate nor a graduate student, but a curious
young person or a lifelong learner who wants to know more about vectors,
tensors, or their applications in Mechanics, Electromagnetics, and General Relativity,
welcome aboard. I commend your initiative, and I hope this book helps
you in your journey. 
1

Vectors 


1.1 Definitions (basic) 

There are many ways to define a vector. For starters, here’s the most basic: 


A vector is the mathematical representation of a physical entity that may be 
characterized by size (or “magnitude”) and direction. 


In keeping with this definition, speed (how fast an object is going) is not represented
by a vector, but velocity (how fast and in which direction an object is
going) does qualify as a vector quantity. Another example of a vector quantity 
is force, which describes how strongly and in what direction something is being 
pushed or pulled. But temperature, which has magnitude but no direction, is not 
a vector quantity. 

The word “vector” comes from the Latin vehere meaning “to carry;” it was 
first used by eighteenth-century astronomers investigating the mechanism by 
which a planet is “carried” around the Sun. 1 In text, the vector nature of an
object is often indicated by placing a small arrow over the variable representing
the object (such as $\vec{F}$), or by using a bold font (such as $\mathbf{F}$), or by underlining
(such as $\underline{F}$ or $\underset{\sim}{F}$). When you begin hand-writing equations involving vectors,
it’s very important that you get into the habit of denoting vectors using one of
these techniques (or another one of your choosing). The important thing is not
how you denote vectors; it’s that you don’t simply write them the same way
you write non-vector quantities.

A vector is most commonly depicted graphically as a directed line segment
or an arrow, as shown in Figure 1.1(a). And as you’ll see later in this
section, a vector may also be represented by an ordered set of N numbers,
where N is the number of dimensions in the space in which the vector
resides.

1 The Oxford English Dictionary. 2nd ed. 1989.

Figure 1.1 Graphical depiction of a vector (a) and a vector field (b).

Of course, the true value of a vector comes from knowing what it represents. 
The vector in Figure 1.1(a), for example, may represent the velocity of the wind 
at some location, the acceleration of a rocket, the force on a football, or any of 
the thousands of vector quantities that you encounter in the world every day. 
Whatever else you may learn about vectors, you can be sure that every one of 
them has two things: size and direction. The magnitude of a vector is usually 
indicated by the length of the arrow, and it tells you the amount of the quantity 
represented by the vector. The scale is up to you (or whoever’s drawing the 
vector), but once the scale has been established, all other vectors should be 
drawn to the same scale. Once you know that scale, you can determine the 
magnitude of any vector just by finding its length. The direction of the vector 
is usually given by indicating the angle between the arrow and one or more 
specified directions (usually the “coordinate axes”), and it tells you which way 
the vector is pointing. 

So if vectors are characterized by their magnitude and direction, does that 
mean that two equally long vectors pointing in the same direction could in fact 
be considered to be the same vector? In other words, if you were to move the 
vector shown in Figure 1.1(a) to a different location without varying its length 
or its pointing direction, would it still be the same vector? In some applications, 
the answer is “yes,” and those vectors are called free vectors. You can move 
a free vector anywhere you’d like as long as you don’t change its length or 
direction, and it remains the same vector. But in many physics and engineering 
problems, you’ll be dealing with vectors that apply at a given location; such
vectors are called “bound” or “anchored” vectors, and you’re not allowed to
relocate bound vectors as you can free vectors. 2 You may see the term “sliding”
vectors used for vectors that are free to move along their length but are not free 
to change length or direction; such vectors are useful for problems involving 
torque and angular motion. 

You can understand the usefulness of bound vectors if you think about an 
application such as representing the velocity of the wind at various points in 
the atmosphere. To do that, you could choose to draw a bound vector at each 
point of interest, and each of those vectors would show the speed and direction 
of the wind at that location (most people draw the vector with its tail - the end 
without the arrow - at the point to which the vector is bound). A collection of 
such vectors is called a vector field; an example is shown in Figure 1.1(b). 

If you think about the ways in which you might represent a bound vector, 
you may realize that the vector can be defined simply by specifying the start 
and end points of the arrow. So in a three-dimensional Cartesian coordinate 
system, you only need to know the values of x, y, and z for each end of the 
vector, as shown in Figure 1.2(a) (you can read about vector representation in 
non-Cartesian coordinate systems later in this chapter). 

Now consider the special case in which the vector is anchored to the origin
of the coordinate system (that is, the end without the arrowhead is at the point
of intersection of the coordinate axes, as shown in Figure 1.2(b)). 3 Such vectors
may be completely specified simply by listing the three numbers that represent
the x-, y-, and z-coordinates of the vector’s end point. Hence a vector anchored
to the origin and stretching five units along the x-axis may be represented as
(5,0,0). In this representation, the values that represent the vector are called the
“components” of the vector, and the number of components it takes to define
a vector is equal to the number of dimensions in the space in which the vector
exists. So in a two-dimensional space a vector may be represented by a pair
of numbers, and in four-dimensional spacetime vectors may appear as lists of
four numbers. This explains why a horizontal list of numbers is called a “row
vector” and a vertical list of numbers is called a “column vector” in computer
science. The number of values in such vectors tells you how many dimensions
there are in the space in which the vector resides.

2 Mathematicians don’t have much use for bound vectors, since the mathematical definition of a
vector deals with how it transforms rather than where it’s located.

3 The vector shown in Figure 1.2(a) can be shifted to this location by subtracting $x_{start}$, $y_{start}$,
and $z_{start}$ from the values at each end.

Figure 1.2 A vector in 3-D Cartesian coordinates, shown in panels (a) and (b).
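
The bookkeeping described above can be sketched in a few lines of Python. This is only an illustration: the function name `components_from_endpoints` and the sample end points are assumptions made for this example, not notation used elsewhere in the text. Subtracting the starting coordinates from the ending coordinates shifts a bound vector to the origin, and what remains is its list of components.

```python
def components_from_endpoints(start, end):
    """Return the components of the vector that runs from `start` to `end`.

    Both arguments are sequences of N coordinates, so the result has
    N components, one per dimension of the space.
    """
    return tuple(e - s for s, e in zip(start, end))


# Hypothetical end points in a 3-D Cartesian system (x, y, z).
start = (1.0, 2.0, 3.0)   # tail of the arrow
end = (6.0, 2.0, 3.0)     # head of the arrow

components = components_from_endpoints(start, end)
print(components)        # (5.0, 0.0, 0.0): five units along the x-axis
print(len(components))   # 3, the number of dimensions of the space
```

The same function works unchanged for a pair of numbers in two dimensions or a list of four numbers in four-dimensional spacetime, because the number of components simply follows the number of coordinates supplied.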

To understand how vectors are different from other entities, it may help to 
consider the nature of some things that are clearly not vectors. Think about the 
temperature in the room in which you’re sitting - at each point in the room, 
the temperature has a value, which you can represent by a single number. That 
value may well be different from the value at other locations, but at any given 
point the temperature can be represented by a single number, the magnitude. 
Such magnitude-only quantities have been called “scalars” ever since W.R.
Hamilton referred to them as “all values contained on the one scale of progression
of numbers from negative to positive infinity.” 4 Thus


A scalar is the mathematical representation of a physical entity that may be 
characterized by magnitude only. 


Other examples of scalar quantities include mass, charge, energy, and speed 
(defined as the magnitude of the velocity vector). It is worth noting that the 
change in temperature over a region of space does have both magnitude and 
direction and may therefore be represented by a vector, so it’s possible to produce
vectors from groups of scalars. You can read about just such a vector
(called the “gradient” of a scalar field) in Chapter 2. 

Since scalars can be represented by magnitude only (single numbers) 
and vectors by magnitude and direction (three numbers in three-dimensional 
space), you might suspect that there are other entities involving magnitude and 
directions that are more complex than vectors (that is, requiring more numbers 
than the number of spatial dimensions). Indeed there are, and such entities are 
called “tensors.” 5 You can read about tensors in the last three chapters of this 
book, but for now this simple definition will suffice: 


A tensor is the mathematical representation of a physical entity that may be
characterized by magnitude and multiple directions.

4 W.R. Hamilton, Phil. Mag. XXIX, 26.

5 As you can learn in the later portions of this book, scalars and vectors also belong to the class
of objects called tensors but have lower rank, so in this section the word “tensors” refers to
higher-rank tensors.


An example of a tensor is the inertia that relates the angular velocity of a 
rotating object to its angular momentum. Since the angular velocity vector 
has a direction and the angular momentum vector has a (potentially different) 
direction, the inertia tensor involves multiple directions. 

And just as a scalar may be represented by a single number and a vector may
be represented by a sequence of three numbers in 3-dimensional space, a tensor
may be represented by an array of $3^R$ numbers in 3-dimensional space. In this
expression, “R” represents the rank of the tensor. So in 3-dimensional space, a
second-rank tensor is represented by $3^2 = 9$ numbers. In N-dimensional space,
scalars still require only one number, vectors require N numbers, and tensors
require $N^R$ numbers.
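
As a quick check of this counting rule, the following Python sketch tabulates $N^R$ for a few ranks. The helper name `component_count` is an assumption introduced for this example; the only thing it encodes is the rule stated above, with rank 0 for scalars and rank 1 for vectors.

```python
def component_count(dimensions, rank):
    """Number of values needed to represent a tensor of the given rank
    in a space with the given number of dimensions (N raised to the power R)."""
    return dimensions ** rank


# Rank 0 is a scalar, rank 1 is a vector, rank 2 and above are higher-rank tensors.
names = ["scalar", "vector", "rank-2 tensor", "rank-3 tensor"]
for rank, name in enumerate(names):
    print(f"{name}: {component_count(3, rank)} number(s) in 3-dimensional space")
```

Changing the first argument to 4 gives the counts for four-dimensional spacetime: 1, 4, 16, and 64.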

Recognizing scalars, vectors, and tensors is easy once you realize that a 
scalar can be represented by a single number, a vector by an ordered set of 
numbers, and a tensor by an array of numbers. So in three-dimensional space, 
they look like this: 

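Scalar: $(x)$

Vector: $\begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}$

Tensor (Rank 2): $\begin{pmatrix} x_{11} & x_{12} & x_{13} \\ x_{21} & x_{22} & x_{23} \\ x_{31} & x_{32} & x_{33} \end{pmatrix}$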



Note that scalars require no subscripts, vectors require a single subscript, 
and tensors require two or more subscripts - the tensor shown here is a tensor 
of rank 2, but you may also encounter higher-rank tensors, as discussed in 
Chapter 5. A tensor of rank 3 may be represented by a three-dimensional array 
of values. 
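
Returning to the inertia example, a small Python sketch may help make “multiple directions” concrete. It assumes the standard relation in which each component of the angular momentum is a weighted sum of the components of the angular velocity, $L_i = \sum_j I_{ij}\,\omega_j$. The numerical values of the inertia array and the angular velocity are made up for illustration, and `apply_tensor` is a hypothetical helper name.

```python
def apply_tensor(tensor, vector):
    """Multiply a rank-2 tensor (a list of rows) by a vector (a list of components)."""
    return [sum(row[j] * vector[j] for j in range(len(vector))) for row in tensor]


# Hypothetical inertia tensor: nine numbers, indexed by two subscripts.
inertia = [
    [2.0, 0.0, 0.0],
    [0.0, 3.0, 0.0],
    [0.0, 0.0, 4.0],
]

# Hypothetical angular velocity: three numbers, indexed by one subscript.
omega = [1.0, 1.0, 0.0]

angular_momentum = apply_tensor(inertia, omega)
print(angular_momentum)  # [2.0, 3.0, 0.0]
```

With these made-up values the angular velocity points along (1, 1, 0) while the angular momentum points along (2, 3, 0), so the two vectors are not parallel. That is the sense in which the array relating them involves more than one direction.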

With these basic definitions in hand, you’re ready to begin considering the 
ways in which vectors can be put to use. Among the most useful of all vectors 
are the Cartesian unit vectors, which you can read about in the next section. 


1.2 Cartesian unit vectors 


If you hope to use vectors to solve problems, it’s essential that you learn how to
handle situations involving more than one vector. The first step in that process
is to understand the meaning of special vectors called “unit vectors” that often
serve as markers for various directions of interest (unit vectors may also be
called “versors”).

Figure 1.3 Unit vectors in 3-D Cartesian coordinates.

The first unit vectors you’re likely to encounter are the unit vectors $\hat{x}$, $\hat{y}$, $\hat{z}$
(also called $\hat{\imath}$, $\hat{\jmath}$, $\hat{k}$) that point in the direction of the x-, y-, and z-axes of the
three-dimensional Cartesian coordinate system, as shown in Figure 1.3. These 
vectors are called unit vectors because their length (or magnitude) is always 
exactly equal to unity, which is another name for “one.” One what? One of 
whatever units you’re using for that axis. 

You should note that the Cartesian unit vectors $\hat{\imath}$, $\hat{\jmath}$, $\hat{k}$ can be drawn at any
location, not just at the origin of the coordinate system. This is illustrated in
Figure 1.4. As long as you draw a vector of unit length pointing in the same
direction as the direction of the (increasing) x-axis, you’ve drawn the $\hat{\imath}$ unit
vector. So the Cartesian unit vectors show you the directions of the x, y, and z 
axes, not the location of the origin. 

As you’ll see in Chapter 2, unit vectors can be extremely helpful when doing 
certain operations such as specifying the portion of a given vector pointing in 
a certain direction. That’s because unit vectors don’t have their own magnitude 
to throw into the mix (actually, they do have their own magnitude, but it is 
always one). 

So when you see an expression such as “$5\hat{\imath}$,” you should think “5 units along
the positive x-direction.” Likewise, $-3\hat{\jmath}$ refers to 3 units along the negative
y-direction, and $\hat{k}$ indicates one unit along the positive z-direction.

Figure 1.4

Of course, there are other coordinate systems in addition to the three perpendicular
axes of the Cartesian system, and unit vectors exist in those coordinate
systems as well; you can see some examples in Section 1.5. One advantage
of the Cartesian unit vectors is that they point in the same direction no matter 
where you go; the x-, y-, and z-axes run in straight lines all the way out to 
infinity, and the Cartesian unit vectors are parallel to the directions of those 
lines everywhere. 

To put unit vectors such as $\hat{\imath}$, $\hat{\jmath}$, $\hat{k}$ to work, you need to understand the
concept of vector components. The next section shows you how to represent 
vectors using unit vectors and vector components. 


1.3 Vector components 

The unit vectors described in the previous section are especially useful when 
they become part of the “components” of a vector. And what are the components
of a vector? Simply stated, they are the pieces that can be used to make
up the vector. 

To understand vector components, think about the vector A shown in 
Figure 1.5. This is a bound vector, anchored at the origin and extending to 
the point (x = 0, y = 3, z = 3) in a three-dimensional Cartesian coordinate 
system. So if you consider the coordinate axes as representing the corner of a 
room, this vector is embedded in the back wall (the yz plane). 

Imagine you’re trying to get from the beginning of vector A to the end - the
direct route would be simply to move in the direction of the vector. But if you
were constrained to move only in the directions of the axes, you could get from
the origin to your destination by taking three (unit) steps along the y-axis, then
turning 90° to your left, and then taking three more (unit) steps in the direction
of the z-axis.

Figure 1.5 Vector A and its components, shown in panels (a) and (b).

What does this little journey have to do with the components of a vector?
Simply this: the lengths of the components of vector A are the distances you
traveled in the directions of the axes. Specifically, in this case the magnitude of
the y-component of vector A (written as $A_y$) is just the distance you traveled
in the direction of the y-axis (3 units), and the magnitude of the z-component
of vector A (written as $A_z$) is the distance you traveled in the direction of the
z-axis (also 3 units). Since you didn’t move at all in the direction of the x-axis,
the magnitude of the x-component of vector A (written as $A_x$) is zero.

A very handy and compact way of writing a vector as a combination of
vector components is this:

$$\vec{A} = A_x\hat{\imath} + A_y\hat{\jmath} + A_z\hat{k}, \qquad (1.1)$$

where the magnitudes of the vector components ($A_x$, $A_y$, and $A_z$) tell you how
many unit steps to take in each direction ($\hat{\imath}$, $\hat{\jmath}$, and $\hat{k}$) to get from the beginning
to the end of vector A. 6
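
The expansion in Eq. 1.1 can be tried out directly. The sketch below stores each Cartesian unit vector as a tuple of three numbers and rebuilds the vector of Figure 1.5, which ends at (x = 0, y = 3, z = 3). The names `I_HAT`, `J_HAT`, `K_HAT`, `scale`, and `add` are assumptions chosen for this example.

```python
# Cartesian unit vectors written as ordered sets of three numbers.
I_HAT = (1.0, 0.0, 0.0)
J_HAT = (0.0, 1.0, 0.0)
K_HAT = (0.0, 0.0, 1.0)


def scale(scalar, vector):
    """Multiply every component of `vector` by `scalar`."""
    return tuple(scalar * c for c in vector)


def add(*vectors):
    """Add any number of vectors component by component."""
    return tuple(sum(cs) for cs in zip(*vectors))


# Components of the vector in Figure 1.5: no steps along x, three along y, three along z.
A_x, A_y, A_z = 0.0, 3.0, 3.0

A = add(scale(A_x, I_HAT), scale(A_y, J_HAT), scale(A_z, K_HAT))
print(A)  # (0.0, 3.0, 3.0)
```

Here `A_x`, `A_y`, and `A_z` are plain numbers, while `scale(A_y, J_HAT)` is itself a vector.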

When you read about vectors and vector components, you’re likely to run 
across statements such as “The components of a vector are the projections of 
the vector onto the coordinate axes.” As you can see in Chapter 4, exactly 
how those projections are made can have a significant influence on the nature 
of the components you get. But in Cartesian coordinate systems (and other
“orthogonal” systems in which the axes are perpendicular to one another), the
concept of projection onto the coordinate axes is unambiguous and may be
very helpful in picturing the components of a vector.

6 Some authors refer to the magnitudes $A_x$, $A_y$, and $A_z$ as the “components of A,” while others
consider the components to be $A_x\hat{\imath}$, $A_y\hat{\jmath}$, and $A_z\hat{k}$. Just remember that $A_x$, $A_y$, and $A_z$ are
scalars, but $A_x\hat{\imath}$, $A_y\hat{\jmath}$, and $A_z\hat{k}$ are vectors.

Figure 1.6 Vector components as projections onto x- and y-axes.


To understand how this works, take a look at vector A and the light sources
and shadows in Figure 1.6. As you can see in Figure 1.6(a), the direction of the
light that produces the shadow on the x-axis is parallel to the y-axis (actually 
antiparallel since it’s moving in the negative y-direction), which in this case is 
the same as saying that the direction of the light is perpendicular to the x-axis. 

Likewise, in Figure 1.6(b), the direction of the light that produces the 
shadow on the y-axis is antiparallel to the x-axis, which is of course perpendic- 
ular to the y-axis. This may seem like a trivial point, but when you encounter 
non-orthogonal coordinate systems, you’ll find that the direction parallel to 
one axis is not necessarily perpendicular to another axis, which gives rise to 
an entirely different type of vector component. This simple fact has profound 
implications for the behavior of vectors and tensors for observers in different 
reference frames, as you’ll see in Chapters 4, 5, and 6. 

No such issues arise in the two-dimensional Cartesian coordinate system
shown in Figure 1.6, and in this case the magnitudes of the components of
vector A are easy to determine. If the angle between vector A and the positive
x-axis is $\theta$, as shown in Figure 1.6(a), it’s clear that the length of A can be seen
as the hypotenuse of a right triangle. The sides of that triangle along the x- and
y-axes are the components $A_x$ and $A_y$. Hence by simple trigonometry you can
write:


$$\begin{aligned}
A_x &= |\vec{A}|\cos(\theta), \\
A_y &= |\vec{A}|\sin(\theta),
\end{aligned} \qquad (1.2)$$


where the vertical bars on each side of $\vec{A}$ signify the magnitude (length)
of vector A. Notice that so long as you measure the angle $\theta$ from the
positive x-axis in the direction toward the positive y-axis (that is, counter-clockwise
in this case), these equations will give the correct sign for the x- and
y-components no matter which quadrant the vector occupies.


For example, if vector A is a vector with a length of 7 meters pointing in a
direction 210° counter-clockwise from the +x-axis, the x- and y-components
are given by Eq. 1.2 as

$$\begin{aligned}
A_x &= |\vec{A}|\cos(\theta) = 7\,\text{m}\,\cos 210^\circ = -6.1\,\text{m}, \\
A_y &= |\vec{A}|\sin(\theta) = 7\,\text{m}\,\sin 210^\circ = -3.5\,\text{m}.
\end{aligned} \qquad (1.3)$$


As expected for a vector pointing down and to the left from the origin, both 
components are negative. 
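
The calculation in Eq. 1.3 is easy to reproduce. The following Python sketch applies Eq. 1.2 to the same 7-meter vector at 210°; the function name `components_from_polar` is an assumption made for this example. Note that the trigonometric functions in Python’s `math` module expect angles in radians, so the angle is converted first.

```python
import math


def components_from_polar(magnitude, angle_deg):
    """Apply Eq. 1.2: return (A_x, A_y) for a vector of the given magnitude
    whose direction is `angle_deg` degrees counter-clockwise from the +x-axis."""
    angle = math.radians(angle_deg)  # math.cos and math.sin work in radians
    return magnitude * math.cos(angle), magnitude * math.sin(angle)


a_x, a_y = components_from_polar(7.0, 210.0)
print(f"A_x = {a_x:.1f} m, A_y = {a_y:.1f} m")  # A_x = -6.1 m, A_y = -3.5 m
```

Because the angle is measured from the positive x-axis toward the positive y-axis, the signs come out correctly without any special handling of the quadrant.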

It’s equally straightforward to find the length and direction of a vector if 
you’re given the vector’s Cartesian components. Since the vector forms the 
hypotenuse of a right triangle with sides $A_x$ and $A_y$, the Pythagorean theorem
tells you that the length of A must be

$$|\vec{A}| = \sqrt{A_x^2 + A_y^2}, \qquad (1.4)$$

and from trigonometry

$$\theta = \arctan\left(\frac{A_y}{A_x}\right), \qquad (1.5)$$

where $\theta$ is measured counter-clockwise from the positive x-axis in a right-handed
coordinate system. If you try this with the components of vector A
from Eq. 1.3 and end up with a direction of 30° rather than 210°, remember
that unless you have a four-quadrant arctan function on your calculator, you
must add 180° to the angle whenever the denominator of the expression ($A_x$ in
this case) is negative.
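
The difference between the ordinary and the four-quadrant arctangent can be seen in a short Python sketch. It uses `math.atan` for the ordinary version and `math.atan2`, which takes the numerator and denominator separately, for the four-quadrant version. The function name `magnitude_and_angle` is an assumption made for this example.

```python
import math


def magnitude_and_angle(a_x, a_y):
    """Return (|A|, theta) from Cartesian components, with theta in degrees
    measured counter-clockwise from the +x-axis, in the range 0 to 360."""
    magnitude = math.hypot(a_x, a_y)                        # sqrt(a_x**2 + a_y**2)
    angle_deg = math.degrees(math.atan2(a_y, a_x)) % 360.0  # four-quadrant arctan
    return magnitude, angle_deg


# Components of the 7-meter vector pointing at 210 degrees.
a_x = 7.0 * math.cos(math.radians(210.0))
a_y = 7.0 * math.sin(math.radians(210.0))

# Ordinary arctan of the ratio cannot tell this vector from its opposite.
naive_deg = math.degrees(math.atan(a_y / a_x))
print(f"ordinary arctan: {naive_deg:.1f} degrees")     # 30.0

magnitude, angle_deg = magnitude_and_angle(a_x, a_y)
print(f"four-quadrant:   {angle_deg:.1f} degrees")     # 210.0
print(f"magnitude:       {magnitude:.1f} m")           # 7.0
```

The ordinary arctangent sees only the ratio $A_y/A_x$, which is the same for a vector and for the vector pointing the opposite way; that is why 180° must be added by hand when $A_x$ is negative.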

Once you have a working understanding of unit vectors and vector components,
you’re ready to do basic vector operations. The entirety of Chapter 2 is
devoted to such operations, but two of them are needed for the remainder of this 
chapter. For that reason, you can read about vector addition and multiplication 
by a scalar in the next section. 


1.4 Vector addition and multiplication by a scalar 

If you’ve read the previous section on vector components, you’ve already seen
two vector operations in action. Those two operations are the addition of vectors
and multiplication of a vector by a scalar. Both of these operations are
used in the expansion of a vector in terms of vector components as in Eq. 1.1
from Section 1.3:

$$\vec{A} = A_x\hat{\imath} + A_y\hat{\jmath} + A_z\hat{k}.$$

In each of these terms, the unit vector ($\hat{\imath}$, $\hat{\jmath}$, or $\hat{k}$) is being multiplied by a
scalar ($A_x$, $A_y$, or $A_z$), and you already know the effect of that: it produces
a new vector, in the same direction as the unit vector, but longer than unity
by the value of the component (or shorter if the magnitude of the component
is between zero and one). So multiplying a vector by any positive scalar does
not change the direction of the vector, but only scales the length of the vector.
Hence, $5\vec{A}$ is a vector in exactly the same direction as A, but with length five
times that of A, as shown in Figure 1.7(a). Likewise, multiplying A by (1/2)
produces a vector that points in the same direction as A but is only half as long.
So the vector component $A_x\hat{\imath}$ is a vector in the $\hat{\imath}$ direction, but with length $A_x$
units (since $\hat{\imath}$ has a length of one unit).

There is a caveat that goes with the “changes length, not direction” rule 
when multiplying a vector by a scalar: if the scalar is negative, then the 
vector is reversed in direction in addition to being scaled in length. Thus 
multiplying vector B by —2 produces the new vector —2 B, and that vector 
is twice as long as B and points in the opposite direction to B, as shown in 
Figure 1.7(b). 
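
Both the rule and its caveat can be checked numerically. The Python sketch below scales a made-up two-dimensional vector by 5, by 1/2, and by -2, then prints the resulting components and lengths. The vector B = (3, 4) and the helper names `scale` and `length` are assumptions chosen for this example.

```python
import math


def scale(scalar, vector):
    """Multiply every component of `vector` by `scalar`."""
    return tuple(scalar * c for c in vector)


def length(vector):
    """Magnitude of a vector from its Cartesian components."""
    return math.sqrt(sum(c * c for c in vector))


B = (3.0, 4.0)  # hypothetical vector with length 5

for s in (5, 0.5, -2):
    scaled = scale(s, B)
    print(f"{s} B = {scaled}, length = {length(scaled)}")
```

For the scalar -2, every component changes sign, so the new vector points opposite to B, while its length is twice that of B.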

The other operation going on in Eq. 1.1 is vector addition, and you already
have an idea of what that means if you recall Figure 1.5 and the process of
getting from the beginning of vector A to the end. In that process, the quantity
$A_y\hat{\jmath}$ represented not only the number of steps you took, but also the direction
in which you took them. Likewise, the quantity $A_z\hat{k}$ represented the number
of steps you took in a different direction. The fact that these two quantities
include directional information means that you cannot simply add them
together algebraically; you must add them “as vectors.”

Figure 1.7 Multiplication of a vector by a scalar.

To accomplish vector addition graphically, you simply imagine moving one 
vector (without changing its length or direction) so that its tail is at the head 
of the other vector. The sum is then determined by making a new vector that 
begins at the start of the first vector and terminates at the end of the second 
vector. You can do this graphically, as in Figure 1.5(b), where the tail of vector
$A_z\hat{k}$ is placed at the head of vector $A_y\hat{\jmath}$, and the sum is the vector from the
beginning of $A_y\hat{\jmath}$ to the end of $A_z\hat{k}$.

This graphical “head-to-tail” approach to vector addition works for any vectors
(and any number of vectors), not just two vectors that are perpendicular to
one another (as $A_y\hat{\jmath}$ and $A_z\hat{k}$ were). An example of this is shown in Figure 1.8.
To graphically add the two vectors A and B in Figure 1.8(a), you simply imagine
moving one of the two vectors so that its tail is at the position of the other
vector’s head (it doesn’t matter which vector you choose to move; the result
will be the same). This is illustrated in Figure 1.8(b), in which vector B has
been displaced so that its tail is at the head of vector A. The sum of these two
vectors (called the “resultant” vector $\vec{C} = \vec{A} + \vec{B}$) is the vector that extends
from the beginning of A to the end of B.

Figure 1.8 Graphical addition of vectors.

Knowing how to add vectors graphically means you can always determine
the sum of two or more vectors simply using a ruler and a protractor; just draw
the vectors head-to-tail (being careful to maintain each vector’s length and
angle), sketch the resultant from the beginning of the first to the end of the last,
and then measure the length (using the ruler) and angle (using the protractor)
of the resultant. This approach can be both tedious and inaccurate, so here’s
an alternative approach that uses the components of each vector: if vector C
is the sum of two vectors A and B, then the magnitude of the x-component of
vector C (which is just $C_x$) is the sum of the magnitudes of the x-components
of vectors A and B (that is, $A_x + B_x$), and the magnitude of the y-component
of vector C (called $C_y$) is the sum of the magnitudes of the y-components of
vectors A and B (that is, $A_y + B_y$). Thus


$$\begin{aligned}
C_x &= A_x + B_x, \\
C_y &= A_y + B_y.
\end{aligned} \qquad (1.6)$$


The rationale for this is shown in Figure 1.9.

Figure 1.9 Component addition of vectors.

Once you have the components $C_x$ and $C_y$ of the resultant vector C, you can
find the magnitude and direction of C using

$$|\vec{C}| = \sqrt{C_x^2 + C_y^2} \qquad (1.7)$$

and

$$\theta = \arctan\left(\frac{C_y}{C_x}\right). \qquad (1.8)$$

To see how this works in practice, imagine that vector A in Figure 1.9 is given
by $\vec{A} = 6\hat{\imath} + \hat{\jmath}$ and vector B is given by $\vec{B} = -2\hat{\imath} + 8\hat{\jmath}$. To add these two
vectors algebraically, you simply use Eqs. 1.6:

$$\begin{aligned}
C_x &= A_x + B_x = 6 + (-2) = 4, \\
C_y &= A_y + B_y = 1 + 8 = 9,
\end{aligned}$$

so $\vec{C} = 4\hat{\imath} + 9\hat{\jmath}$. If you wish to know the magnitude of C, you can just plug
the components into Eq. 1.7 to get

$$|\vec{C}| = \sqrt{C_x^2 + C_y^2} = \sqrt{4^2 + 9^2} = \sqrt{16 + 81} = 9.85.$$

And the angle that C makes with the positive x-axis is given by Eq. 1.8:

$$\theta = \arctan\left(\frac{C_y}{C_x}\right) = \arctan\left(\frac{9}{4}\right) = 66.0^\circ.$$
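
The same worked example can be run as a short Python sketch. It adds the components as in Eqs. 1.6 and then applies Eqs. 1.7 and 1.8, using `math.atan2` as a four-quadrant arctangent. The helper name `add` is an assumption made for this example.

```python
import math


def add(a, b):
    """Add two vectors component by component (Eqs. 1.6)."""
    return tuple(x + y for x, y in zip(a, b))


A = (6.0, 1.0)    # A = 6 i + j
B = (-2.0, 8.0)   # B = -2 i + 8 j

C = add(A, B)
magnitude = math.hypot(*C)                          # Eq. 1.7
angle_deg = math.degrees(math.atan2(C[1], C[0]))    # Eq. 1.8, four-quadrant form

print(C)                        # (4.0, 9.0)
print(f"{magnitude:.2f}")       # 9.85
print(f"{angle_deg:.1f}")       # 66.0
print(add(B, A) == C)           # True: the order of addition does not matter
```

The last line mirrors the remark about head-to-tail addition: it makes no difference which of the two vectors is moved, so A + B and B + A give the same resultant.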


With the basic operations of vector addition and multiplication of a vector 
by a scalar in hand, you’re ready to begin thinking about the more advanced 
uses of vectors. But you’re also ready to attack a variety of problems involving 
vectors, and you can find a set of such problems at the end of this chapter. 7 


1.5 Non-Cartesian unit vectors 

The three straight, mutually perpendicular axes of the Cartesian coordinate sys- 
tem are immensely useful for a variety of problems in physics and engineering. 
Some problems, however, are much easier to solve in other coordinate systems, 
often because the axes of those systems more closely align with the directions 
over which one or more of the parameters relevant to the problem remain con- 
stant or vary in a predictable manner. The unit vectors of such non-Cartesian 
coordinate systems are the subject of this section, and transformations between 
coordinate systems are discussed in Chapter 4. 

As described earlier, it takes exactly N numbers to unambiguously represent 
any location in a space of N dimensions, which means you have to specify 
three numbers (such as x, y, and z) to designate a location in our Universe 
of three spatial dimensions. However, on the two-dimensional surface of the 
Earth (ignoring height variation for the moment) it takes only two numbers 
(latitude and longitude, for example) to designate a specific point. And one of 
the few benefits to living on a long, infinitely thin island is that you can set
up a rendezvous using only a single number to describe the location (“I’ll be
waiting for you at 3.75 kilometers”).

7 Remember that full solutions are available on the book's website.

Of course, numbers define locations only after you’ve defined the coordi- 
nate system that you’re using. For example, do you mean 3.75 kilometers from 
the east end of the island or from the west end? In every space of 1, 2, 3, or 
more dimensions, you can devise an infinite number of coordinate systems to 
specify locations in that space. In each of those coordinate systems, at each 
location there’s one direction in which one of the coordinates is increasing 
the fastest, and if you lay a vector with length of one unit in that direction, 
you’ve defined a coordinate unit vector for that system. So in the Cartesian 
coordinate system, the i unit vector shows you the direction in which the 
x-coordinate increases, the j unit vector shows you the direction in which the 
y-coordinate increases, and the k unit vector shows you the direction in which 
the z-coordinate increases. Other coordinate systems have their own coordinate
unit vectors, as well. 

Consider the two-dimensional coordinate systems shown in Figure 1.10. In 
a two-dimensional space, you know that it takes two numbers to specify any 
location, and those numbers could be x and y, defined along two straight axes 
that intersect at a right angle. The x value tells you how far you are to the right 
of the y-axis (or to the left if the x value is negative), and the y value tells 
you how far you are above the x-axis (or below if the y value is negative). 
But you could equally well specify any location in this two-dimensional space
by noting how far and in what direction you’ve moved from the origin. In the
standard version of these “polar” coordinates, the distance from the origin is
called r and the direction is specified by giving the angle $\theta$ measured
counterclockwise from the positive x-axis.

Figure 1.10 2-D rectangular (a) and polar (b) coordinates.

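It’s easy enough to figure out one set of coordinates if you know the others;
for example, if you know the values of x and y, you can find r and $\theta$ using

$$
\begin{aligned}
r &= \sqrt{x^2 + y^2}, \\
\theta &= \arctan\left(\frac{y}{x}\right).
\end{aligned}
\tag{1.9}
$$

Likewise, if you have the values of r and $\theta$, you can find x and y using

$$
\begin{aligned}
x &= r\cos(\theta), \\
y &= r\sin(\theta).
\end{aligned}
\tag{1.10}
$$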


For the point shown in Figure 1.10, if the values of x and y are 4 cm and 9 cm,
then r has a value of approximately 9.85 cm and $\theta$ has a value of 66.0°. Clearly,
whether you write (x, y) = (4 cm, 9 cm) or (r, $\theta$) = (9.85 cm, 66.0°), you’re
referring to the same location; it’s not the point that’s changed, it’s only the
point’s coordinates that are different.
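The round trip between the two descriptions of the same point can be sketched in a few lines of Python. The function names here are hypothetical, angles are handled in degrees for readability, and `math.atan2` stands in for the arctangent of Eq. 1.9 so that points with negative x come out in the right quadrant.

```python
import math

def cartesian_to_polar(x, y):
    """Return (r, theta in degrees) for the point (x, y), as in Eq. 1.9."""
    r = math.hypot(x, y)
    theta = math.degrees(math.atan2(y, x))   # quadrant-aware arctangent
    return r, theta

def polar_to_cartesian(r, theta_degrees):
    """Return (x, y) for the point (r, theta), as in Eq. 1.10."""
    t = math.radians(theta_degrees)
    return r * math.cos(t), r * math.sin(t)

r, theta = cartesian_to_polar(4.0, 9.0)      # the point from Figure 1.10, in cm
x, y = polar_to_cartesian(r, theta)
print(round(r, 2), round(theta, 1))          # 9.85 66.0
print(round(x, 6), round(y, 6))              # 4.0 9.0
```

Converting to polar coordinates and back returns the original (x, y), which is just another way of saying that only the coordinates changed, not the point.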

And if you choose to use the polar coordinate system to represent the point,
do unit vectors exist that serve the same function as $\hat{\imath}$ and $\hat{\jmath}$ in Cartesian
coordinates? They certainly do, and with a little logic you can figure out which
direction they must point. After all, you know that the unit vector $\hat{\imath}$ shows you
the direction of increasing x and the unit vector $\hat{\jmath}$ shows you the direction of
increasing y, but now you’re using r and $\theta$ instead of x and y. So it seems
reasonable that the unit vector $\hat{r}$ at any location should point in the direction of
increasing r, and the unit vector $\hat{\theta}$ should point in the direction of increasing $\theta$.
For the point shown in Figure 1.10, that means that $\hat{r}$ should point up and to the
right, in the direction of increasing r if $\theta$ is held constant. At that same point,
$\hat{\theta}$ should point up and to the left, in the direction of increasing $\theta$ if r is held
constant. These polar unit vectors are shown for one point in Figure 1.10(b).

An important consequence of this definition is that the directions of $\hat{r}$ and $\hat{\theta}$
will be different at different locations. They’ll always be perpendicular to one
another, but they will not point in the same directions as they do for the point
in Figure 1.10. The dependence of the polar unit vectors on position can be
seen in the following relations:

$$
\begin{aligned}
\hat{r} &= \cos(\theta)\,\hat{\imath} + \sin(\theta)\,\hat{\jmath}, \\
\hat{\theta} &= -\sin(\theta)\,\hat{\imath} + \cos(\theta)\,\hat{\jmath}.
\end{aligned}
\tag{1.11}
$$

So if $\theta = 0$ (which means your location is on the +x-axis), then $\hat{r} = \hat{\imath}$ and
$\hat{\theta} = \hat{\jmath}$. But if $\theta = 90^\circ$ (so your location is on the +y-axis), then $\hat{r} = \hat{\jmath}$ and
$\hat{\theta} = -\hat{\imath}$.
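To watch the polar unit vectors change direction from place to place, you might evaluate Eq. 1.11 at a few angles. The following is only a sketch; `polar_unit_vectors` is a name invented for this example, and the angles in the loop are arbitrary.

```python
import math

def polar_unit_vectors(theta_degrees):
    """Return (r_hat, theta_hat) as (x, y) component tuples, per Eq. 1.11."""
    t = math.radians(theta_degrees)
    r_hat = (math.cos(t), math.sin(t))
    theta_hat = (-math.sin(t), math.cos(t))
    return r_hat, theta_hat

for angle in (0.0, 90.0, 66.0):
    r_hat, theta_hat = polar_unit_vectors(angle)
    # The dot product of the pair should vanish at every angle,
    # because the two unit vectors stay perpendicular.
    perpendicular = r_hat[0] * theta_hat[0] + r_hat[1] * theta_hat[1]
    print(angle,
          [round(c, 3) for c in r_hat],
          [round(c, 3) for c in theta_hat],
          abs(perpendicular) < 1e-12)
```

At 0° and 90° the components reproduce the special cases quoted above (to within floating-point round-off), and the last column confirms that the pair stays perpendicular.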
Does this dependence on position mean that these unit vectors are not “real”
vectors? That depends on your definition of a real vector. If you define a vector 
as a quantity with magnitude and direction, the polar unit vectors do meet 
your definition. But they do not meet the definition of free vectors described in 
Section 1.1, since they may not be moved without changing their direction. 

This means that if you express a vector in polar coordinates and then 
take the derivative of that vector, you’ll have to account for the change in 
the unit vectors, as well. That’s one of the advantages offered by Cartesian 
coordinates - the unit vectors do not change no matter where you go in the 
space. 
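As a short worked illustration of that point, you can differentiate the relations of Eq. 1.11 directly. The overdot for a time derivative and the example of a position vector $r\hat{r}$ are assumptions introduced here for the sketch; the Cartesian unit vectors are treated as constants.

```latex
\frac{d\hat{r}}{d\theta}
  = -\sin(\theta)\,\hat{\imath} + \cos(\theta)\,\hat{\jmath}
  = \hat{\theta},
\qquad
\frac{d\hat{\theta}}{d\theta}
  = -\cos(\theta)\,\hat{\imath} - \sin(\theta)\,\hat{\jmath}
  = -\hat{r},

\frac{d}{dt}\left(r\,\hat{r}\right)
  = \dot{r}\,\hat{r} + r\,\frac{d\hat{r}}{d\theta}\,\dot{\theta}
  = \dot{r}\,\hat{r} + r\,\dot{\theta}\,\hat{\theta}.
```

The second term in the last line exists only because $\hat{r}$ changes direction as $\theta$ changes; the analogous Cartesian calculation has no such term.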

As you might expect, the situation is slightly more complicated for three- 
dimensional coordinate systems. Whether you choose to use Cartesian or 
non-Cartesian coordinates, you’re going to need three variables to represent all 
the possible locations in a three-dimensional space, and each of the coordinates 
is going to come with its own unit vector. The two most common three- 
dimensional non-Cartesian coordinate systems are cylindrical and spherical 
coordinates, which you can see in Figures 1.11 and 1.12. 

Figure 1.11 Cylindrical coordinates.

Figure 1.12 Spherical coordinates.

In cylindrical coordinates a point P is specified by r, $\phi$, z, where r (sometimes
called $\rho$) is the perpendicular distance from the z-axis, $\phi$ is the angle
measured from the x-axis to the projection of r onto the xy plane, and z is the
same as the z in Cartesian coordinates. Here’s how you find r, $\phi$, and z if you
know x, y, and z:

$$
\begin{aligned}
r &= \sqrt{x^2 + y^2}, \\
\phi &= \arctan\left(\frac{y}{x}\right), \\
z &= z.
\end{aligned}
\tag{1.12}
$$

And if you have the values of r, $\phi$, and z, you can find x, y, and z using

$$
\begin{aligned}
x &= r\cos(\phi), \\
y &= r\sin(\phi), \\
z &= z.
\end{aligned}
\tag{1.13}
$$

A vector at the point P is specified in cylindrical coordinates in terms of
three mutually perpendicular components with unit vectors perpendicular to
the cylinder of radius r, perpendicular to the plane through the z-axis at angle
$\phi$, and perpendicular to the xy plane at distance z. As in the Cartesian case,
each cylindrical coordinate unit vector points in the direction in which that
parameter is increasing, so $\hat{r}$ points in the direction of increasing r, $\hat{\phi}$ points
in the direction of increasing $\phi$, and $\hat{z}$ points in the direction of increasing z.
The unit vectors ($\hat{r}$, $\hat{\phi}$, $\hat{z}$) form a right-handed set, so if you point the fingers
of your right hand along $\hat{r}$ and push it into $\hat{\phi}$ with your right palm, your right
thumb will show you the direction of $\hat{z}$.
The following equations relate the Cartesian to the cylindrical unit vectors:

$$
\begin{aligned}
\hat{r} &= \cos(\phi)\,\hat{\imath} + \sin(\phi)\,\hat{\jmath}, \\
\hat{\phi} &= -\sin(\phi)\,\hat{\imath} + \cos(\phi)\,\hat{\jmath}, \\
\hat{z} &= \hat{z}.
\end{aligned}
\tag{1.14}
$$
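Equations 1.12 through 1.14 translate almost line for line into code. The sketch below uses hypothetical function names, takes angles in degrees, substitutes `math.atan2` for the arctangent so the quadrant is handled automatically, and uses an arbitrary test point.

```python
import math

def cartesian_to_cylindrical(x, y, z):
    """Return (r, phi in degrees, z) following Eq. 1.12."""
    r = math.hypot(x, y)
    phi = math.degrees(math.atan2(y, x))   # quadrant-aware arctangent
    return r, phi, z

def cylindrical_to_cartesian(r, phi_degrees, z):
    """Return (x, y, z) following Eq. 1.13."""
    p = math.radians(phi_degrees)
    return r * math.cos(p), r * math.sin(p), z

def cylindrical_unit_vectors(phi_degrees):
    """Return (r_hat, phi_hat, z_hat) as Cartesian (x, y, z) tuples, per Eq. 1.14."""
    p = math.radians(phi_degrees)
    r_hat = (math.cos(p), math.sin(p), 0.0)
    phi_hat = (-math.sin(p), math.cos(p), 0.0)
    z_hat = (0.0, 0.0, 1.0)
    return r_hat, phi_hat, z_hat

point = (1.0, 1.0, 2.0)                    # arbitrary illustrative point
r, phi, z = cartesian_to_cylindrical(*point)
back = cylindrical_to_cartesian(r, phi, z)
print(round(r, 4), round(phi, 1), z)       # 1.4142 45.0 2.0
print([round(c, 6) for c in back])         # [1.0, 1.0, 2.0]
```

Notice that the unit vectors depend only on $\phi$, not on r or z, which is exactly what Eq. 1.14 says.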


In spherical coordinates a point P is specified by r, $\theta$, $\phi$, where r represents
the distance from the origin, $\theta$ is the angle measured from the z-axis toward
the xy plane, and $\phi$ is the angle measured from the x-axis (or xz plane) to the
constant-$\phi$ plane containing point P. With the z-axis up, $\theta$ is sometimes called
the zenith angle and $\phi$ the azimuth angle. You can determine the spherical
coordinates r, $\theta$, and $\phi$ from x, y, and z using the following equations:

$$
\begin{aligned}
r &= \sqrt{x^2 + y^2 + z^2}, \\
\theta &= \arccos\left(\frac{z}{\sqrt{x^2 + y^2 + z^2}}\right), \\
\phi &= \arctan\left(\frac{y}{x}\right).
\end{aligned}
\tag{1.15}
$$

And you can find x, y, and z from r, $\theta$, and $\phi$ using:

$$
\begin{aligned}
x &= r\sin(\theta)\cos(\phi), \\
y &= r\sin(\theta)\sin(\phi), \\
z &= r\cos(\theta).
\end{aligned}
\tag{1.16}
$$
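The spherical conversions of Eqs. 1.15 and 1.16 can be sketched the same way. The function names are invented for this example, angles are in degrees, `math.atan2` replaces the arctangent for quadrant safety, and the conversion to spherical coordinates assumes the point is not at the origin (where $\theta$ and $\phi$ are undefined).

```python
import math

def cartesian_to_spherical(x, y, z):
    """Return (r, theta, phi) with angles in degrees, following Eq. 1.15."""
    r = math.sqrt(x * x + y * y + z * z)
    theta = math.degrees(math.acos(z / r))   # angle from the +z-axis; needs r > 0
    phi = math.degrees(math.atan2(y, x))     # quadrant-aware arctangent
    return r, theta, phi

def spherical_to_cartesian(r, theta_degrees, phi_degrees):
    """Return (x, y, z) following Eq. 1.16."""
    t = math.radians(theta_degrees)
    p = math.radians(phi_degrees)
    return (r * math.sin(t) * math.cos(p),
            r * math.sin(t) * math.sin(p),
            r * math.cos(t))

start = (1.0, 1.0, math.sqrt(2.0))           # arbitrary illustrative point
r, theta, phi = cartesian_to_spherical(*start)
back = spherical_to_cartesian(r, theta, phi)
print(round(r, 6), round(theta, 6), round(phi, 6))
print([round(c, 6) for c in back])
```

For this particular point the result should come out very close to r = 2 with both angles equal to 45°, and the second line should reproduce the starting coordinates.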


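In spherical coordinates, a vector at the point P is specified in terms of three
mutually perpendicular components with unit vectors perpendicular to the
sphere of radius r, perpendicular to the plane through the z-axis at angle $\phi$,
and perpendicular to the cone of angle $\theta$. The unit vectors ($\hat{r}$, $\hat{\theta}$, $\hat{\phi}$) form a
right-handed set, and are related to the Cartesian unit vectors as follows:

$$
\begin{aligned}
\hat{r} &= \sin(\theta)\cos(\phi)\,\hat{\imath} + \sin(\theta)\sin(\phi)\,\hat{\jmath} + \cos(\theta)\,\hat{k}, \\
\hat{\theta} &= \cos(\theta)\cos(\phi)\,\hat{\imath} + \cos(\theta)\sin(\phi)\,\hat{\jmath} - \sin(\theta)\,\hat{k}, \\
\hat{\phi} &= -\sin(\phi)\,\hat{\imath} + \cos(\phi)\,\hat{\jmath}.
\end{aligned}
\tag{1.17}
$$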
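If you want some reassurance that the three vectors of Eq. 1.17 really are mutually perpendicular, of unit length, and right-handed, a numerical spot check is easy to set up. This sketch borrows the dot and cross products that are introduced properly in Chapter 2; the helper names and the test angles are arbitrary choices made for the illustration.

```python
import math

def spherical_unit_vectors(theta_degrees, phi_degrees):
    """Return (r_hat, theta_hat, phi_hat) as Cartesian tuples, per Eq. 1.17."""
    t = math.radians(theta_degrees)
    p = math.radians(phi_degrees)
    r_hat = (math.sin(t) * math.cos(p), math.sin(t) * math.sin(p), math.cos(t))
    theta_hat = (math.cos(t) * math.cos(p), math.cos(t) * math.sin(p), -math.sin(t))
    phi_hat = (-math.sin(p), math.cos(p), 0.0)
    return r_hat, theta_hat, phi_hat

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

r_hat, theta_hat, phi_hat = spherical_unit_vectors(35.0, 110.0)  # arbitrary angles

# Each vector dotted with itself should give 1 (unit length).
unit_length = all(abs(dot(v, v) - 1.0) < 1e-12 for v in (r_hat, theta_hat, phi_hat))
# Each distinct pair should give 0 (mutually perpendicular).
orthogonal = all(abs(dot(a, b)) < 1e-12
                 for a, b in ((r_hat, theta_hat), (theta_hat, phi_hat), (phi_hat, r_hat)))
# Right-handed set: r_hat crossed into theta_hat should reproduce phi_hat.
right_handed = all(abs(c - d) < 1e-12
                   for c, d in zip(cross(r_hat, theta_hat), phi_hat))
print(unit_length, orthogonal, right_handed)
```

All three checks should print `True` for any choice of angles, not just the pair used here.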

You may be asking yourself “Do I really need all these different unit vectors?”
Well, need may be a bit strong, but your life will certainly be easier
if you’re trying to describe motion along a line of constant longitude on a
spherical planet (the $\hat{\theta}$ direction) or the direction of a magnetic field around a
current-carrying wire (the $\hat{\phi}$ direction). You’ll find some examples of that in
the problems at the end of this chapter.


1.6 Basis vectors 

If you think about the unit vectors $\hat{\imath}$, $\hat{\jmath}$, and $\hat{k}$ and vector components such as
$A_x\hat{\imath}$, $A_y\hat{\jmath}$, and $A_z\hat{k}$, you may realize that any vector in our three-dimensional
Cartesian coordinate system can be made up of three components, each one 
telling you how many steps to take in the direction of one of the coordi- 
nate axes. Since those steps may be large or small, in the positive or negative 
direction, you can reach any point in the space containing these vectors. Little 
wonder, then, that i, j, and k are one example of “basis vectors” in this space; 
combined with appropriate magnitudes, they form the basis of any vector in 
the space. 

And you don’t need to use only these particular vectors to make up any 
vector in this space - you can easily imagine using three vectors that are twice 
as long as the unit vectors i, j, and k, as shown in Figure 1.13(a). Although 
the vector components would change if you switched to these longer basis 
vectors, you’d have no trouble using them to make up any vector within the 
space. Specifically, if the unit vectors were twice as long, the values of $A_x$, $A_y$,
and $A_z$ would have to be only half as big to reach a given point in space.

You might even think of using three non-orthogonal, non-unit vectors such
as the vectors $\vec{e}_1$, $\vec{e}_2$, and $\vec{e}_3$ in Figure 1.13(b) as your basis vectors. Of course,
if you were to select three coplanar vectors (that is, vectors lying in the same
plane), you’d quickly find that scaling and combining those vectors allows you
to reach any point within that plane, but all points outside the plane would be
unreachable. But as long as one of the three vectors is not coplanar with the
other two, then appropriate scaling and combining will get you to any point
in the space, and the vectors $\vec{e}_1$, $\vec{e}_2$, and $\vec{e}_3$ form a perfectly usable basis set
(mathematicians say that they “span” the vector space).

Figure 1.13 Alternative basis vectors.

You can ensure that three vectors are not coplanar by requiring them to be 
“linearly independent,” which means that no two of the vectors may be scaled 
and combined to give the third, and no two are collinear (that is, lying along 
the same line or parallel to one another). This is often stated as the requirement 
that the only way to scale and combine the three vectors and get zero as the 
result is to scale each of the vectors by zero. In other words, for three linearly
independent vectors $\vec{e}_1$, $\vec{e}_2$, and $\vec{e}_3$, the equation

$$A\vec{e}_1 + B\vec{e}_2 + C\vec{e}_3 = 0 \tag{1.18}$$

can only be true if A = B = C = 0.
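One practical way to test three vectors for linear independence is to stack their components into a 3×3 array and compute its determinant. This relies on a standard linear-algebra fact that the text above does not derive: Eq. 1.18 has only the solution A = B = C = 0 exactly when that determinant is nonzero. The function names and the tolerance `tol` below are assumptions made for this sketch.

```python
def det3(e1, e2, e3):
    """Determinant of the 3x3 array whose rows are e1, e2, and e3."""
    return (e1[0] * (e2[1] * e3[2] - e2[2] * e3[1])
            - e1[1] * (e2[0] * e3[2] - e2[2] * e3[0])
            + e1[2] * (e2[0] * e3[1] - e2[1] * e3[0]))

def linearly_independent(e1, e2, e3, tol=1e-12):
    """True if the only way to combine the vectors to zero is with zero weights."""
    return abs(det3(e1, e2, e3)) > tol

# Non-orthogonal, non-unit, but not coplanar: a usable basis.
print(linearly_independent((1, 0, 0), (1, 1, 0), (1, 1, 1)))   # True
# All three lie in the xy plane, so points off that plane are unreachable.
print(linearly_independent((1, 0, 0), (0, 1, 0), (2, 3, 0)))   # False
```

In floating-point work an exactly zero determinant is rare, which is why the comparison uses a small tolerance rather than testing for equality with zero.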

So as long as you pick three linearly independent vectors, you have a viable
set of basis vectors. And if you choose three non-coplanar vectors $\vec{e}_1$, $\vec{e}_2$, and
$\vec{e}_3$ of non-unit length, it’s quite simple to form unit vectors from these vectors.
Since dividing a vector by a positive scalar changes its length but not its
direction, you simply divide each vector by its magnitude:

$$
\hat{e}_1 = \frac{\vec{e}_1}{|\vec{e}_1|}, \qquad
\hat{e}_2 = \frac{\vec{e}_2}{|\vec{e}_2|}, \qquad
\hat{e}_3 = \frac{\vec{e}_3}{|\vec{e}_3|}.
$$
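Dividing a vector by its own magnitude takes only a couple of lines of code. In this sketch the name `normalize` and the sample vector are made up for illustration; the guard against the zero vector is an added safety check, since a vector of zero length has no direction to preserve.

```python
import math

def normalize(v):
    """Return the unit vector pointing in the same direction as v."""
    mag = math.sqrt(sum(c * c for c in v))
    if mag == 0.0:
        raise ValueError("the zero vector has no direction")
    return tuple(c / mag for c in v)

e = (2.0, -1.0, 2.0)                       # illustrative vector with |e| = 3
e_hat = normalize(e)
print([round(c, 4) for c in e_hat])        # [0.6667, -0.3333, 0.6667]
print(round(math.sqrt(sum(c * c for c in e_hat)), 12))   # 1.0
```

The same function can be applied to each member of a basis set in turn.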


The concepts described in this section may be used to construct an infinite 
number of bases, but the most common are the “orthonormal” bases such as 
i, j, and k. These bases are called “ortho” because they’re orthogonal (per- 
pendicular to one another) and “normal” because they are normalized to a 
magnitude of one. Orthonormal bases will get you through the majority of 
problems you’re likely to face. 

One last fact about basis vectors in various coordinate systems will serve you 
very well if you study physics and engineering beyond the basic level, espe- 
cially if your studies include the tensors discussed in Chapters 4 through 6. 
That fact is this: basis vectors that point along the axes of one coordi- 
nate system may be described in another coordinate system using partial 
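derivatives. Specifically, imagine that you’re converting from spherical to
rectangular coordinates. The basis vector along the original spherical (r) axis
can be written in the Cartesian (x, y, and z) system as

$$
\begin{aligned}
\vec{e}_r &= \frac{\partial x}{\partial r}\,\hat{\imath} + \frac{\partial y}{\partial r}\,\hat{\jmath} + \frac{\partial z}{\partial r}\,\hat{k} \\
&= \sin\theta\cos\phi\,\hat{\imath} + \sin\theta\sin\phi\,\hat{\jmath} + \cos\theta\,\hat{k}.
\end{aligned}
$$

Likewise, the $\vec{e}_\theta$ and $\vec{e}_\phi$ basis vectors can be written as

$$
\begin{aligned}
\vec{e}_\theta &= \frac{\partial x}{\partial \theta}\,\hat{\imath} + \frac{\partial y}{\partial \theta}\,\hat{\jmath} + \frac{\partial z}{\partial \theta}\,\hat{k} \\
&= r\cos\theta\cos\phi\,\hat{\imath} + r\cos\theta\sin\phi\,\hat{\jmath} - r\sin\theta\,\hat{k}, \\
\vec{e}_\phi &= \frac{\partial x}{\partial \phi}\,\hat{\imath} + \frac{\partial y}{\partial \phi}\,\hat{\jmath} + \frac{\partial z}{\partial \phi}\,\hat{k} \\
&= -r\sin\theta\sin\phi\,\hat{\imath} + r\sin\theta\cos\phi\,\hat{\jmath}.
\end{aligned}
$$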


Notice that these basis vectors are not all unit vectors (because their magnitudes
are not all equal to one), nor do they all have the same dimensions ($\vec{e}_r$
is dimensionless, but $\vec{e}_\theta$ and $\vec{e}_\phi$ have dimensions of length). Neither of these
characteristics disqualifies these as basis vectors, and you can always turn them 
into unit vectors by dividing by their magnitudes (take a look at the problems 
at the end of this chapter and their on-line solutions if you want to see how this 
works). 
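You can also see the unequal magnitudes numerically by estimating the partial derivatives with finite differences. The sketch below is one possible check, not the method used in the on-line solutions: `basis_vector` is a hypothetical helper that applies a central difference with an arbitrarily chosen step `h`, and the test point is arbitrary. Angles are in radians here because the derivatives are taken with respect to them.

```python
import math

def spherical_to_cartesian(r, theta, phi):
    """(x, y, z) from (r, theta, phi), with angles in radians (Eq. 1.16)."""
    return (r * math.sin(theta) * math.cos(phi),
            r * math.sin(theta) * math.sin(phi),
            r * math.cos(theta))

def basis_vector(coords, index, h=1e-6):
    """Central-difference estimate of the partial derivative of (x, y, z)
    with respect to coords[index], holding the other coordinates fixed."""
    plus = list(coords)
    minus = list(coords)
    plus[index] += h
    minus[index] -= h
    p = spherical_to_cartesian(*plus)
    m = spherical_to_cartesian(*minus)
    return tuple((pc - mc) / (2.0 * h) for pc, mc in zip(p, m))

def magnitude(v):
    return math.sqrt(sum(c * c for c in v))

point = (2.0, math.radians(35.0), math.radians(110.0))   # arbitrary (r, theta, phi)
e_r, e_theta, e_phi = (basis_vector(point, i) for i in range(3))

print(round(magnitude(e_r), 6), round(magnitude(e_theta), 6), round(magnitude(e_phi), 6))
print(round(point[0] * math.sin(point[1]), 6))            # r sin(theta), for comparison
```

The three magnitudes should come out close to 1, r, and r sin(θ) respectively, which are the factors you would divide by to recover the unit vectors of Eq. 1.17.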

In general, if the coordinates of the original system are called $x_1$, $x_2$, and $x_3$
(these were r, $\theta$, and $\phi$ in the example just discussed), and the coordinates of
the new system are called $x'_1$, $x'_2$, and $x'_3$ (these were x, y, and z in the example),
then the basis vectors along the original coordinate axes can be written in the
new system as

$$
\begin{aligned}
\vec{e}_1 &= \frac{\partial x'_1}{\partial x_1}\,\vec{e}\,'_1 + \frac{\partial x'_2}{\partial x_1}\,\vec{e}\,'_2 + \frac{\partial x'_3}{\partial x_1}\,\vec{e}\,'_3, \\
\vec{e}_2 &= \frac{\partial x'_1}{\partial x_2}\,\vec{e}\,'_1 + \frac{\partial x'_2}{\partial x_2}\,\vec{e}\,'_2 + \frac{\partial x'_3}{\partial x_2}\,\vec{e}\,'_3, \\
\vec{e}_3 &= \frac{\partial x'_1}{\partial x_3}\,\vec{e}\,'_1 + \frac{\partial x'_2}{\partial x_3}\,\vec{e}\,'_2 + \frac{\partial x'_3}{\partial x_3}\,\vec{e}\,'_3.
\end{aligned}
\tag{1.20}
$$

In other words, the partial derivatives $\partial x'_1/\partial x_1$, $\partial x'_2/\partial x_1$, and
$\partial x'_3/\partial x_1$ are the components of the first original (unprimed) basis vector
expressed in the new (primed) coordinate system. For this reason, you’ll find
that some authors define basis vectors in terms of partial derivatives.

If you’re not familiar with partial derivatives or need a refresher, you’ll find one in the next
chapter.

These relationships will prove to be extremely valuable in the study of 
coordinate-system transformation and tensor analysis, so file them away if your 
studies include those topics. 


1.7 Chapter 1 problems 

1.1 (a) If |B| = 18 m and B points along the negative x-axis, what are $B_x$
and $B_y$?

(b) If $C_x = -3$ m/s and $C_y = 5$ m/s, find the magnitude of C and the
angle that C makes with the positive x-axis.

1.2 Vector A has magnitude of 11 m/s² and makes an angle of 65 degrees
with the positive x-axis, and vector B has Cartesian components $B_x = 4$
m/s² and $B_y = -3$ m/s². If vector C = A + B,

(a) Find the x- and y-components of C;

(b) What are the magnitude and direction of C?

1.3 Imagine that the y-axis points north and the x-axis points east. 

(a) If you travel a distance r = 22 km in a straight line from the origin in a 
direction 35 degrees south of west, what is your position in Cartesian 
(x, y) coordinates? 

(b) If you travel 6 miles due south from the origin and then turn west and 
travel 2 miles, how far from the origin and in what direction is your 
final position? 

1.4 What are the x- and y-components of the polar unit vectors $\hat{r}$ and $\hat{\theta}$ when

(a) $\theta$ = 180 degrees?

(b) $\theta$ = 45 degrees?

(c) $\theta$ = 215 degrees?

1.5 Cylindrical coordinates

(a) If r = 2 meters, $\phi$ = 35 degrees, and z = 1 meter, what are x, y, and z?

(b) If (x, y, z) = (3, 2, 4) meters, what are (r, $\phi$, z)?

1.6 (a) In cylindrical coordinates, show that $\hat{r}$ points along the x-axis
if $\phi$ = 0.

(b) In what direction is $\hat{\phi}$ if $\phi$ = 90 degrees?

1.7 (a) In spherical coordinates, find x, y, and z if r = 25 meters, $\theta$ = 35
degrees, and $\phi$ = 110 degrees.

(b) Find (r, $\theta$, $\phi$) if (x, y, z) = (8, 10, 15) meters.
1.8 (a) For spherical coordinates, show that $\hat{\theta}$ points along the negative
z-axis if $\theta$ = 90 degrees.

(b) If $\phi$ also equals 90 degrees, in what direction are $\hat{r}$ and $\hat{\phi}$?

1.9 As you can read in Chapter 3, the magnetic field around a long, straight
wire carrying a steady current I is given in spherical coordinates by the
expression $\vec{B} = \frac{\mu_0 I}{2\pi R}\,\hat{\phi}$, where $\mu_0$ is a constant and R is the perpendicular
distance from the wire to the observation point. Find an expression for B
in Cartesian coordinates.

1.10 If $\vec{e}_1 = 5\hat{\imath} - 3\hat{\jmath} + 2\hat{k}$, $\vec{e}_2 = \hat{\jmath} - 3\hat{k}$, and $\vec{e}_3 = 2\hat{\imath} + \hat{\jmath} - 4\hat{k}$, what are the
unit vectors $\hat{e}_1$, $\hat{e}_2$, and $\hat{e}_3$?
Vector operations


If you were tracking the main ideas of Chapter 1, you should realize that 
vectors are representations of physical quantities - they’re mathematical tools 
that help you visualize and describe a physical situation. In this chapter, you 
can read about a variety of ways to use those tools to solve problems. You’ve 
already seen how to add vectors and how to multiply vectors by a scalar (and 
why such operations are useful); this chapter contains many other “vector oper- 
ations” through which you can combine and manipulate vectors. Some of these 
operations are simple and some are more complex, but each will prove useful 
in solving problems in physics and engineering. The first section of this chapter 
explains the simplest form of vector multiplication: the scalar product. 


2.1 Scalar product 

Why is it worth your time to understand the form of vector multiplication 
called the scalar or “dot” product? For one thing, forming the dot product 
between two vectors is very useful when you’re trying to find the projection 
of one vector onto another. And why might you want to do that? Well, you 
may be interested in knowing how much work is done by a force acting on an 
object. The first instinct of many students is to think of work as “force times 
distance” (which is a reasonable starting point). But if you’ve ever taken a 
course that went a bit deeper than the introductory level, you may remember 
that the definition of work as force times distance applies only to the special 
case in which the force points in exactly the same direction as the displacement 
of the object. In the more general case in which the force acts at some angle to 
the direction of the displacement, you have to find the component of the force 
along the displacement. That’s one example of exactly what the dot product 
can do for you, and you’ll find more in the problems at the end of this chapter. 
How do you go about computing the dot product between two vectors? Well,
if you know the Cartesian components of each vector (call the vectors A and
B), you can use

$$\vec{A} \circ \vec{B} = A_x B_x + A_y B_y + A_z B_z. \tag{2.1}$$

Or if you know the angle $\theta$ between the vectors,

$$\vec{A} \circ \vec{B} = |\vec{A}||\vec{B}|\cos\theta, \tag{2.2}$$

where $|\vec{A}|$ and $|\vec{B}|$ represent the magnitude (length) of the vectors A and B. 1
Note that the dot product between two vectors gives a scalar result (just a single
value, no direction).

To grasp the physical significance of the dot product, consider vectors A
and B which differ in direction by angle $\theta$, as shown in Figure 2.1a. For these
vectors, the projection of A onto the direction of B is $|\vec{A}|\cos(\theta)$, as shown in
Figure 2.1b. Multiplying this projection by the length of B gives $|\vec{A}||\vec{B}|\cos(\theta)$.
Thus the dot product $\vec{A} \circ \vec{B}$ represents the projection of A onto the direction of
B multiplied by the length of B. The scalar result of this operation is exactly
the same as the result of finding the projection of B onto the direction of A
and then multiplying that value by the length of A. Hence the order of the
two vectors in the dot product is irrelevant; $\vec{A} \circ \vec{B}$ gives the same result as
$\vec{B} \circ \vec{A}$.

The scalar product can be particularly useful when one of the vectors in
the product is a unit vector. That’s because the length of a unit vector is by
definition equal to one, so a scalar product such as $\vec{A} \circ \hat{k}$ finds the projection of
vector A onto the direction of $\hat{k}$ (the z-direction) multiplied by the magnitude of
$\hat{k}$ (which is one). Thus to find the component of any vector in a given direction,
you can simply form the dot product between that vector and the unit vector in
the desired direction. It’s quite likely you’ll come across problems in physics
and engineering in which you have a vector (A) and you wish to know the
component of that vector that’s perpendicular to a specified surface; if you
know the unit normal vector ($\hat{n}$) for the surface, the scalar product $\vec{A} \circ \hat{n}$ gives
you that perpendicular component of A.

Figure 2.1 Two vectors and their scalar product. (a) Vectors A and B. (b) The
projection of A onto B, $|\vec{A}|\cos\theta$, times the length of B, $|\vec{B}|$, gives the dot
product $\vec{A} \circ \vec{B} = |\vec{A}||\vec{B}|\cos\theta$.

1 The equivalence between Equations 2.1 and 2.2 is demonstrated in the problems at the end of
this chapter.
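Here is a minimal sketch of using the dot product to pick out the component of a vector along a chosen direction. The helper names are invented for this example, and the direction `n` is a hypothetical surface normal that is deliberately not of unit length, so the code normalizes it before taking the dot product.

```python
import math

def dot(a, b):
    """Dot product from Cartesian components (Eq. 2.1)."""
    return sum(ai * bi for ai, bi in zip(a, b))

def unit(v):
    """Unit vector in the direction of v."""
    mag = math.sqrt(dot(v, v))
    return tuple(c / mag for c in v)

def component_along(a, direction):
    """Signed component of a along `direction` (which need not be unit length)."""
    return dot(a, unit(direction))

A = (5.0, -2.0, 4.0)
k_hat = (0.0, 0.0, 1.0)
print(component_along(A, k_hat))           # 4.0, the z-component of A

n = (1.0, 1.0, 0.0)                        # hypothetical normal, not unit length
print(round(component_along(A, n), 4))     # 2.1213
```

Dotting A with $\hat{k}$ simply returns $A_z$, as the paragraph above predicts; the second case shows the same idea for a direction that doesn't line up with a coordinate axis.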

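The scalar product is also useful in finding the angle between two vectors.
To understand how that works, consider the two expressions for the dot product
given in Eqs. 2.1 and 2.2. Since

$$\vec{A} \circ \vec{B} = |\vec{A}||\vec{B}|\cos\theta = A_x B_x + A_y B_y + A_z B_z, \tag{2.3}$$

then dividing both sides by the product of the magnitudes of A and B gives

$$
\begin{aligned}
\cos(\theta) &= \frac{A_x B_x + A_y B_y + A_z B_z}{|\vec{A}||\vec{B}|}, \\
\theta &= \arccos\left(\frac{A_x B_x + A_y B_y + A_z B_z}{|\vec{A}||\vec{B}|}\right).
\end{aligned}
\tag{2.4}
$$

So if you wish to find the angle between two vectors $\vec{A} = 5\hat{\imath} - 2\hat{\jmath} + 4\hat{k}$ and
$\vec{B} = 3\hat{\imath} + \hat{\jmath} + 7\hat{k}$, you can use Eq. 2.4 to find

$$
\begin{aligned}
\theta &= \arccos\left(\frac{(5)(3) + (-2)(1) + (4)(7)}{\sqrt{(5)^2 + (-2)^2 + (4)^2}\,\sqrt{(3)^2 + (1)^2 + (7)^2}}\right) \\
&= \arccos\left(\frac{41}{\sqrt{45}\,\sqrt{59}}\right) \\
&= 37.3^\circ.
\end{aligned}
$$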
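The same calculation can be sketched in Python. The function names are made up for this illustration, and the clamping step is an added precaution: floating-point round-off can push the cosine slightly outside the range from -1 to 1 for nearly parallel vectors, which would make `math.acos` raise an error.

```python
import math

def dot(a, b):
    """Dot product from Cartesian components (Eq. 2.1)."""
    return sum(ai * bi for ai, bi in zip(a, b))

def angle_between(a, b):
    """Angle in degrees between vectors a and b, following Eq. 2.4."""
    cos_theta = dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))
    # Clamp to [-1, 1] to guard against round-off before calling acos.
    cos_theta = max(-1.0, min(1.0, cos_theta))
    return math.degrees(math.acos(cos_theta))

A = (5.0, -2.0, 4.0)
B = (3.0, 1.0, 7.0)
print(dot(A, B))                        # 41.0
print(round(angle_between(A, B), 1))    # 37.3
```

Swapping the order of the arguments gives the same angle, since the dot product doesn't care which vector comes first.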
One final note about the scalar product: any unit vector dotted with itself
gives a result of 1 (since, for example, $\hat{\imath} \circ \hat{\imath} = |\hat{\imath}||\hat{\imath}|\cos(0^\circ) = (1)(1)(1) = 1$),
and the dot product between two different orthogonal unit vectors gives a result
of zero (since, for example, $\hat{\imath} \circ \hat{\jmath} = |\hat{\imath}||\hat{\jmath}|\cos(90^\circ) = (1)(1)(0) = 0$).


2.2 Cross product 

Another way to multiply two vectors is to form the “cross product” between 
them. Unlike the dot product, which gives a scalar result, the cross prod- 
uct results in another vector. Why bother learning this form of vector 
multiplication? One reason is that the cross product is just what you need when 
you’re trying to find the result of certain physical processes, such as applying a 
force at the end of a lever arm or firing a charged particle into a magnetic field. 
Computing the cross product between two vectors is only slightly more complicated
than finding the dot product. If you know the Cartesian components
of both vectors, the cross product is given by

$$
\begin{aligned}
\vec{A} \times \vec{B} = {} & (A_y B_z - A_z B_y)\,\hat{\imath} \\
& + (A_z B_x - A_x B_z)\,\hat{\jmath} \\
& + (A_x B_y - A_y B_x)\,\hat{k},
\end{aligned}
\tag{2.5}
$$

which can be written as

$$
\vec{A} \times \vec{B} =
\begin{vmatrix}
\hat{\imath} & \hat{\jmath} & \hat{k} \\
A_x & A_y & A_z \\
B_x & B_y & B_z
\end{vmatrix}.
\tag{2.6}
$$


If you haven’t seen determinants before and you need some help getting from 
Eq. 2.6 to Eq. 2.5, you can find an explanation of how this works on the book’s 
website. 

The direction of the vector formed by the cross product of A and B is
perpendicular to both A and B (that is, perpendicular to the plane containing
both A and B), as shown in Figure 2.2. Of course, there are two directions
perpendicular to this plane, so how do you know which one corresponds to
the direction of $\vec{A} \times \vec{B}$? The answer is provided by the “right-hand rule,”
which you can invoke by opening your right hand and making your thumb
perpendicular to the direction of your fingers in the plane of your palm. Now
imagine using your right palm and fingers to push the first vector (A in this
case) into the direction of the second vector (B in this case) through the
smallest angle. As you push, your thumb shows you the direction of the cross
product.

A very important difference between the dot product and the cross product is
that the order of the vectors is irrelevant for the dot product but matters greatly
for the cross product. You can see this by imagining the cross product $\vec{B} \times \vec{A}$ in
Figure 2.2. In order to push vector B into vector A with your right palm, you’d
have to turn your hand upside-down (that is, with your thumb pointing down).
And since your thumb shows you the direction of the cross product, you can
see that $\vec{B} \times \vec{A}$ points in the opposite direction from $\vec{A} \times \vec{B}$. That means that

$$\vec{A} \times \vec{B} = -\vec{B} \times \vec{A}, \tag{2.7}$$

since the negative of a vector is just a vector of the same magnitude in the
opposite direction. A quick method of computing the magnitude of the cross
product is to use

$$|\vec{A} \times \vec{B}| = |\vec{A}||\vec{B}|\sin(\theta), \tag{2.8}$$

where $|\vec{A}|$ is the magnitude of A, $|\vec{B}|$ is the magnitude of B, and $\theta$ is the angle
between A and B. 3

2 Some people find it easier to imagine aligning the fingers of your (open) right hand with the
direction of the first vector, and then curling your fingers toward the second vector. Or you can
point your right index finger in the direction of the first vector and your right middle finger in
the direction of the second vector. Whether you use the pushing, curling, or pointing approach,
your right thumb shows you the direction of the cross product.

Figure 2.2 Direction of the cross product $\vec{A} \times \vec{B}$.

Figure 2.3 The cross product as area. The length of the cross product equals the
area of the parallelogram formed by vectors A and B in the plane containing
both A and B.

One way to picture the length and direction of the cross product is illustrated
in Figure 2.3. Just as the dot product involves the projection of one vector onto
another, the cross product also has a geometrical interpretation. In this case,
the magnitude of the cross product between two vectors is proportional to the
area of the parallelogram formed with those two vectors as adjacent sides. As
you may recall, the area of a parallelogram is just its base times its height, and
in this case the height of the parallelogram is $|\vec{B}|\sin(\theta)$ and the length of the
base is $|\vec{A}|$. That makes the area of the parallelogram equal to $|\vec{A}||\vec{B}|\sin(\theta)$,
exactly as given in Eq. 2.8.

So if the angle between two vectors A and B is zero or 180° (that is, if A
and B are parallel or antiparallel), the cross product between them is zero. And
as the angle between A and B approaches 90° or 270°, the magnitude of the
cross product increases, reaching a maximum value of $|\vec{A}||\vec{B}|$ when the vectors
are perpendicular.

3 The equivalence of Eq. 2.8 and the magnitude of the expression in Eq. 2.5 is demonstrated in
the problems at the end of this chapter.
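The properties of the cross product described so far (its component form, its perpendicularity to both vectors, the sign flip when the order is reversed, and its magnitude as an area) can all be checked with a short sketch. The helper names and the two sample vectors are choices made for this illustration only.

```python
import math

def cross(a, b):
    """Cross product from Cartesian components (Eq. 2.5)."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def magnitude(v):
    return math.sqrt(dot(v, v))

A = (5.0, -2.0, 4.0)
B = (3.0, 1.0, 7.0)
AxB = cross(A, B)
BxA = cross(B, A)

print(AxB)                              # (-18.0, -23.0, 11.0)
print(BxA)                              # (18.0, 23.0, -11.0), as Eq. 2.7 requires
print(dot(AxB, A), dot(AxB, B))         # 0.0 0.0, perpendicular to both

# Compare |A x B| with |A||B| sin(theta), using the angle from the dot product.
theta = math.acos(dot(A, B) / (magnitude(A) * magnitude(B)))
area_from_angle = magnitude(A) * magnitude(B) * math.sin(theta)
print(round(magnitude(AxB), 6), round(area_from_angle, 6))
```

The last line prints the same number twice (to six decimal places), which is the area of the parallelogram with A and B as adjacent sides.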

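Using the definition of the cross product and the right-hand rule, you should
be able to convince yourself that the following relations are true:

$$
\begin{aligned}
\hat{\imath} \times \hat{\imath} &= 0 & \hat{\imath} \times \hat{\jmath} &= \hat{k} & \hat{\jmath} \times \hat{\imath} &= -\hat{k} \\
\hat{\jmath} \times \hat{\jmath} &= 0 & \hat{\jmath} \times \hat{k} &= \hat{\imath} & \hat{k} \times \hat{\jmath} &= -\hat{\imath} \\
\hat{k} \times \hat{k} &= 0 & \hat{k} \times \hat{\imath} &= \hat{\jmath} & \hat{\imath} \times \hat{k} &= -\hat{\jmath}.
\end{aligned}
\tag{2.9}
$$

Applying these relations term-by-term to the product of $\vec{A} = A_x\hat{\imath} + A_y\hat{\jmath} + A_z\hat{k}$
and $\vec{B} = B_x\hat{\imath} + B_y\hat{\jmath} + B_z\hat{k}$ should help you understand where Eqs. 2.6 and
2.5 come from (and if you need some help making that work out, there’s a
problem on this at the end of this chapter, with the full solution on the book’s
website).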

Applications of the cross product include torque problems (in which
$\vec{\tau} = \vec{r} \times \vec{F}$) and magnetic force problems (in which
$\vec{F}_B = q\vec{v} \times \vec{B}$); you can find
examples of these in the chapter-end problems.


2.3 Triple scalar product 


Once you understand the dot product and cross product described in the previ- 
ous two sections, you may be wondering if it’s possible to combine these two 
vector operations. Happily, it’s not only possible, it’s actually useful to do so. 
After all, you can define all the mathematical operations you’d like, but unless 
those operations result in something that you can apply to solve problems, 
you’d have to leave them in the “curiosity” file. You’ve seen how the dot prod- 
uct finds employment when projections of vectors onto specified directions are 
needed and when work is to be calculated, and how the cross product can be 
called into action when torques and magnetic forces are at play. But does it 
make sense to combine the dot and cross product operations in a manner such
as $\vec{A} \circ (\vec{B} \times \vec{C})$? Yes it does. 4 This is called the “triple scalar product” or “scalar
triple product” and it has several useful applications.

4 But $(\vec{A} \circ \vec{B}) \times \vec{C}$ makes no sense, since $(\vec{A} \circ \vec{B})$ gives a scalar, and you can’t cross that scalar
into C.
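The mathematics of this operation are straightforward; you know that

$$
\begin{aligned}
\vec{B} \times \vec{C} = {} & (B_y C_z - B_z C_y)\,\hat{\imath} \\
& + (B_z C_x - B_x C_z)\,\hat{\jmath} \\
& + (B_x C_y - B_y C_x)\,\hat{k},
\end{aligned}
\tag{2.10}
$$

and from Eq. 2.1 you also know that

$$\vec{A} \circ \vec{B} = A_x B_x + A_y B_y + A_z B_z,$$

so combining the dot and cross product gives

$$
\begin{aligned}
\vec{A} \circ (\vec{B} \times \vec{C}) = {} & A_x (B_y C_z - B_z C_y) \\
& + A_y (B_z C_x - B_x C_z) \\
& + A_z (B_x C_y - B_y C_x).
\end{aligned}
\tag{2.11}
$$

A handy way to write this is

$$
\vec{A} \circ (\vec{B} \times \vec{C}) =
\begin{vmatrix}
A_x & A_y & A_z \\
B_x & B_y & B_z \\
C_x & C_y & C_z
\end{vmatrix}.
\tag{2.12}
$$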


One geometrical interpretation of the triple scalar product can be understood
with the help of Figure 2.4. In this figure, vectors A, B, and C represent
the sides of a parallelepiped. The area of the base of this parallelepiped is
$|\vec{B} \times \vec{C}|$, as in Figure 2.3, and its height is equal to $|\vec{A}|\cos(\phi)$, where $\phi$
is the angle between A and the direction of $\vec{B} \times \vec{C}$. That means that the
volume of the parallelepiped (the height times the area of the base) must be
$|\vec{A}|\cos(\phi)\,(|\vec{B} \times \vec{C}|)$. Writing this as $|\vec{A}||\vec{B} \times \vec{C}|\cos(\phi)$ should help you see
that this has the same form as the definition of the dot product in Eq. 2.2 and
is therefore just $\vec{A} \circ (\vec{B} \times \vec{C})$.



Figure 2.4 The triple scalar product as volume.

Hence the triple scalar product A o (B x C) may be interpreted as the volume
of the parallelepiped formed by vectors A, B, and C. You should note that the
triple product will give a positive result so long as the vectors A, B, and C
form a right-handed system (that is, pushing A into B with the palm of your
right hand gives a direction onto which C projects in a positive sense, and
likewise for pushing B into C and pushing C into A). For a left-handed
ordering the result is negative, and its magnitude is the volume.

Seeing the relationship between the triple scalar product of three vectors and 
the volume formed by those vectors makes it easy to understand why the triple 
scalar product may be used as a test to determine whether three vectors are 
coplanar (that is, whether all three lie in the same plane). Just imagine how 
the parallelepiped in Figure 2.4 would look if vectors A, B, and C were all 
in the same plane. In that case, the height of the parallelepiped would be zero 
and the projection of A onto the direction of B x C would be zero, which 
means the triple product A o (B x C) would have to be zero. Stated another 
way, if the projection of A onto the direction of B x C is not zero, then A 
cannot lie in the same plane as B and C. Thus 

A o (B x C) = 0 (2.13) 

is both a necessary and a sufficient condition for vectors A, B, and C to be
coplanar.

Equating A o (B x C) to the volume of the parallelepiped formed by vectors 
A, B, and C should also help you see that any cyclic permutation of the vectors 
(such as B o (C x A) or C o (A x B)) gives the same result for the triple 
scalar product, since the volume of the parallelepiped is the same in each of 
these cases. Some authors describe this as the ability to interchange the dot 
and the cross without affecting the result (since (A x B) o C is the same as 
C o (A x B)). 
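
The component formula, the determinant form, the cyclic-permutation property, and the coplanarity test can all be checked numerically. The following is a minimal sketch in plain Python; the helper names and the sample vectors A, B, C, and D are illustrative assumptions rather than anything taken from the text.

```python
def dot(a, b):
    # Dot product from components (Eq. 2.1).
    return sum(ai * bi for ai, bi in zip(a, b))

def cross(a, b):
    # Cross product from components (same pattern as Eq. 2.10).
    ax, ay, az = a
    bx, by, bz = b
    return (ay * bz - az * by, az * bx - ax * bz, ax * by - ay * bx)

def triple_scalar(a, b, c):
    # A o (B x C), as in Eq. 2.11.
    return dot(a, cross(b, c))

def det3(a, b, c):
    # Determinant with rows a, b, c (Eq. 2.12), expanded along the first row.
    ax, ay, az = a
    bx, by, bz = b
    cx, cy, cz = c
    return (ax * (by * cz - bz * cy)
            - ay * (bx * cz - bz * cx)
            + az * (bx * cy - by * cx))

def are_coplanar(a, b, c, tol=1e-12):
    # Eq. 2.13: the vectors are coplanar when the triple product vanishes.
    return abs(triple_scalar(a, b, c)) <= tol

# Sample vectors (hypothetical values chosen only for illustration).
A = (1, 2, 3)
B = (0, 1, 4)
C = (2, 0, 1)
D = (1, 4, 11)  # D = A + 2B, so D lies in the plane of A and B

print("A o (B x C) =", triple_scalar(A, B, C))
print("determinant =", det3(A, B, C))
print("B o (C x A) =", triple_scalar(B, C, A))
print("C o (A x B) =", triple_scalar(C, A, B))
print("A, B, D coplanar?", are_coplanar(A, B, D))
```

With floating-point inputs the triple product of coplanar vectors is rarely exactly zero, which is why the test compares against a small tolerance instead of testing for equality.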

One application in which the triple scalar product finds use is the determi- 
nation of reciprocal vectors, as explained in the sections in Chapter 4 dealing 
with covariant and contravariant components of vectors. 


2.4 Triple vector product 

The triple scalar product described in the previous section is not the only use- 
ful way to multiply three vectors. An operation such as A x (B x C) (called 
the “triple vector product”) comes in very handy when you’re dealing with 
certain problems involving angular momentum and centripetal acceleration. 
Unlike the triple scalar product, which produces a scalar result (since the
second operation is a dot product), the triple vector product yields a vector
result (since both operations are cross products). You should note that
A x (B x C) is not the same as (A x B) x C; the location of the parentheses
matters greatly in the triple vector product. The triple vector product is
somewhat tedious to calculate by brute force, but thankfully a simplified
expression exists:

$$\vec{A} \times (\vec{B} \times \vec{C}) = \vec{B}(\vec{A} \circ \vec{C}) - \vec{C}(\vec{A} \circ \vec{B}). \tag{2.14}$$

After all the previous discussion of the various ways in which vectors can be 
multiplied, you can be forgiven for thinking that the right side of this equation 
looks a bit strange, with no circle or cross between B and A o C or between C 
and A o B. Just remember that A o C and A o B are scalars, so the expressions
in parentheses in Eq. 2.14 are simply scalar multipliers of vectors B and C. 
Does this mean that the result of the operation A x (B x C) is a vector that is 
some linear combination of the second and third vectors in the triple product? 
That’s exactly what it means, as you can see by considering Figure 2.5. 

In this figure, you can see the vector B x C pointing straight up, perpendicular
to the plane containing vectors B and C. Now imagine forming the cross
product of vector A with vector B x C by pushing A into the direction of B x C
with the palm of your right hand. The result of this operation, labelled vector
A x (B x C), is back in the plane containing vectors B and C. To understand
why this is true, consider the fact that the vector that results from the operation
B x C must be perpendicular to the plane containing B and C. If you now
cross A into that vector, the resulting vector must be perpendicular to both A
and to (B x C), which puts it back in the plane containing vectors B and C.
And if the vector result of the operation A x (B x C) is in the same plane as
vectors B and C, then it must be a linear combination of those two vectors.

You can remember Eq. 2.14 as the “BAC minus CAB” rule so long as you
remember to write the members of the triple product in the correct sequence
(A, B, C) with the parentheses around the last two vectors. To see where this
comes from, you can simply use the definition of the cross product (Eq. 2.6) to
write


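$$\vec{A} \times (\vec{B} \times \vec{C}) = \begin{vmatrix} \hat{\imath} & \hat{\jmath} & \hat{k} \\ A_x & A_y & A_z \\ (\vec{B} \times \vec{C})_x & (\vec{B} \times \vec{C})_y & (\vec{B} \times \vec{C})_z \end{vmatrix}. \tag{2.15}$$

And from Equation 2.5 you know that

$$\vec{B} \times \vec{C} = (B_y C_z - B_z C_y)\,\hat{\imath} + (B_z C_x - B_x C_z)\,\hat{\jmath} + (B_x C_y - B_y C_x)\,\hat{k}. \tag{2.16}$$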


Substituting these terms into Eq. 2.15 gives 

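$$\vec{A} \times (\vec{B} \times \vec{C}) = \begin{vmatrix} \hat{\imath} & \hat{\jmath} & \hat{k} \\ A_x & A_y & A_z \\ (B_y C_z - B_z C_y) & (B_z C_x - B_x C_z) & (B_x C_y - B_y C_x) \end{vmatrix}. \tag{2.17}$$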


Multiplying this out looks ugly at first: 

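$$\begin{aligned}
\vec{A} \times (\vec{B} \times \vec{C}) = {} & [A_y (B_x C_y - B_y C_x) - A_z (B_z C_x - B_x C_z)]\,\hat{\imath} \\
& + [A_z (B_y C_z - B_z C_y) - A_x (B_x C_y - B_y C_x)]\,\hat{\jmath} \\
& + [A_x (B_z C_x - B_x C_z) - A_y (B_y C_z - B_z C_y)]\,\hat{k}.
\end{aligned} \tag{2.18}$$

But a little rearranging gives

$$\begin{aligned}
\vec{A} \times (\vec{B} \times \vec{C}) = {} & (A_y C_y + A_z C_z)(B_x \hat{\imath}) - (A_y B_y + A_z B_z)(C_x \hat{\imath}) \\
& + (A_z C_z + A_x C_x)(B_y \hat{\jmath}) - (A_z B_z + A_x B_x)(C_y \hat{\jmath}) \\
& + (A_x C_x + A_y C_y)(B_z \hat{k}) - (A_x B_x + A_y B_y)(C_z \hat{k}),
\end{aligned} \tag{2.19}$$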


which still isn’t pretty, but it does hold some promise. That promise can be 
realized by adding nothing to each row of Eq. 2.19. Nothing, that is, in the 
following form: 


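$$\begin{aligned}
A_x B_x C_x\,\hat{\imath} - A_x B_x C_x\,\hat{\imath} & \qquad \text{(add this to the top row)} \\
A_y B_y C_y\,\hat{\jmath} - A_y B_y C_y\,\hat{\jmath} & \qquad \text{(add this to the middle row)} \\
A_z B_z C_z\,\hat{k} - A_z B_z C_z\,\hat{k} & \qquad \text{(add this to the bottom row)}
\end{aligned}$$

These additions make Eq. 2.19 a good deal more friendly:

$$\begin{aligned}
\vec{A} \times (\vec{B} \times \vec{C}) = {} & (A_x C_x + A_y C_y + A_z C_z)(B_x \hat{\imath}) - (A_x B_x + A_y B_y + A_z B_z)(C_x \hat{\imath}) \\
& + (A_x C_x + A_y C_y + A_z C_z)(B_y \hat{\jmath}) - (A_x B_x + A_y B_y + A_z B_z)(C_y \hat{\jmath}) \\
& + (A_x C_x + A_y C_y + A_z C_z)(B_z \hat{k}) - (A_x B_x + A_y B_y + A_z B_z)(C_z \hat{k}).
\end{aligned}$$

Or

$$\begin{aligned}
\vec{A} \times (\vec{B} \times \vec{C}) = {} & (A_x C_x + A_y C_y + A_z C_z)(B_x \hat{\imath} + B_y \hat{\jmath} + B_z \hat{k}) \\
& - (A_x B_x + A_y B_y + A_z B_z)(C_x \hat{\imath} + C_y \hat{\jmath} + C_z \hat{k}).
\end{aligned}$$

But B_x i + B_y j + B_z k is just the vector B, C_x i + C_y j + C_z k is the vector C,
and the other two terms fit the definition of dot products (Eq. 2.1). Thus

$$\vec{A} \times (\vec{B} \times \vec{C}) = (\vec{A} \circ \vec{C})\vec{B} - (\vec{A} \circ \vec{B})\vec{C} = \vec{B}(\vec{A} \circ \vec{C}) - \vec{C}(\vec{A} \circ \vec{B}).$$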

2.5 Partial derivatives 


Once you understand the basic vector operations of dot, cross, and triple 
products, it’s a small step to more advanced vector operations such as gradient, 
divergence, curl, and the Laplacian. But these are differential vector operations, 
so before you can make that step, it’s important for you to understand the dif- 
ference between ordinary derivatives and partial derivatives. This is worth your 
time and effort because differential vector operations have many applications 
in diverse areas of physics and engineering. 

You probably first encountered ordinary derivatives when you learned how
to find the slope of a line (m = dy/dx) or how to determine the speed of an
object given its position as a function of time (v_x = dx/dt). Happily, partial
derivatives are based on the same general concepts as ordinary derivatives, but
extend those concepts to functions of multiple variables. And you should never
have any doubt as to which kind of derivative you’re dealing with, because
ordinary derivatives are written as d/dx or d/dt and partial derivatives are
written as ∂/∂x or ∂/∂t.

As you may recall, ordinary derivatives come about when you’re interested 
in the change of one variable with respect to another. For example, you may 
encounter a variable y which is a function of another variable x (which means 
that the value of y depends on the value of x). This can be written as y = f(x),
where y is called the “dependent variable” and x is called the “independent
variable.” The ordinary derivative of y with respect to x (written as dy/dx) tells
you how much the value of y changes for a small change in the variable x. If
you make a graph with y on the vertical axis and x on the horizontal axis, as
in Figure 2.6, then the slope of the line between any two points (x1, y1) and
(x2, y2) on the graph is simply (y2 - y1)/(x2 - x1) = Δy/Δx. That’s because the
slope is defined as “the rise over the run,” and since the rise is Δy for a run
Δx, the slope of the line between any two points must be Δy/Δx.

But if you look closely at the expanded region of Figure 2.6, you’ll notice
that the graph of y versus x has a slight curve between points (x1, y1) and
(x2, y2), so the slope is actually changing in that interval. Thus the ratio Δy/Δx
can’t represent the slope everywhere between those points. Instead, it represents
the average slope over this interval, as suggested by the dashed line
between points (x1, y1) and (x2, y2) (which by the mean value theorem does
equal the slope somewhere in between these two points, but not necessarily
in the middle). To represent the slope at a given point on the curve more
precisely, all you have to do is to allow the “run” Δx to become very small. As
Δx approaches zero, the difference between the dashed line and the curved
line in Figure 2.6 becomes negligible. If you write the incremental run as dx
and the (also incremental) rise as dy, then the slope at any point on the line can
be written as dy/dx. This is the reasoning that equates the derivative of a function
to the slope of the graph of that function.
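
To watch the “rise over run” settle onto the derivative, you can shrink the run numerically. This is a minimal sketch in plain Python; the curve y = x² and the point x = 1 are assumptions chosen for illustration (for this curve the exact slope at x = 1 is 2).

```python
def f(x):
    # Hypothetical example curve, y = x**2, whose exact slope at x is 2*x.
    return x ** 2

def secant_slope(func, x, dx):
    # Rise over run between the points at x and x + dx.
    return (func(x + dx) - func(x)) / dx

x0 = 1.0
for dx in (1.0, 0.1, 0.01, 0.001):
    # As the run dx shrinks, the average slope approaches the slope at x0.
    print(f"run = {dx:<6} average slope = {secant_slope(f, x0, dx):.6f}")
```

For this particular curve the average slope works out to 2 + Δx, so each tenfold reduction of the run brings the result ten times closer to 2.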

Now imagine that you have a variable z that depends on two other variables, 
say x and y, so z = f(x, y). One way to picture such a case is to visualize a
surface in three-dimensional space, as in Figure 2.7. The height of this surface 
above the xy plane is z, which gets higher and lower at different values of 
x and y. And since the height z may change at a different rate in different 
directions, a single derivative will not generally be sufficient to characterize 
the total change in height as you move from one point to another. You can see
the height z changing at different rates in Figure 2.8; at the location shown in
the figure, the slope of the surface is quite steep if you move in the direction of
increasing y (while remaining at the same value of x), but the slope is almost
zero if you move in the direction of increasing x (while holding your y-value
constant).

Figure 2.6 Slope of the line y = f(x).

Figure 2.7 Surface in 3-D space (z = f(x, y)).

Figure 2.8 Surface in 3-D space (z = f(x, y)). Labels: slope along x-direction
is not very steep; slope along y-direction is quite steep.

This illustrates the usefulness of partial derivatives, which are derivatives
formed by allowing one independent variable (such as x or y in Figure 2.8)
to change while holding other independent variables constant. So the partial
derivative ∂z/∂x represents the slope of the surface at a given location if you
move only along the x-direction from that location, and the partial derivative
∂z/∂y represents the slope if you move only along the y-direction. You may find
these partial derivatives written as (∂z/∂x)|_y and (∂z/∂y)|_x, where the variables
that appear in the subscript after the vertical line are held constant.

As you’ve probably already guessed, the change in the value of z as either
x or y changes is easily found using partial derivatives. If only x changes, then
dz = (∂z/∂x) dx, and if only y changes, then dz = (∂z/∂y) dy. And if both x and y
change, then

$$dz = \frac{\partial z}{\partial x}\,dx + \frac{\partial z}{\partial y}\,dy. \tag{2.20}$$
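
A small numerical experiment shows how well Eq. 2.20 predicts the change in z for small steps. The sketch below uses plain Python; the surface z = x²y, its partial derivatives, and the step sizes are assumptions chosen for illustration and are not taken from the text.

```python
def z(x, y):
    # Hypothetical example surface chosen for illustration: z = x**2 * y.
    return x ** 2 * y

def dz_dx(x, y):
    # Partial derivative with respect to x (y held constant).
    return 2 * x * y

def dz_dy(x, y):
    # Partial derivative with respect to y (x held constant).
    return x ** 2

def total_differential(x, y, dx, dy):
    # Eq. 2.20: dz = (dz/dx) dx + (dz/dy) dy.
    return dz_dx(x, y) * dx + dz_dy(x, y) * dy

x0, y0 = 1.0, 2.0
dx, dy = 0.01, 0.02

estimate = total_differential(x0, y0, dx, dy)
actual = z(x0 + dx, y0 + dy) - z(x0, y0)

print(f"estimate from Eq. 2.20: {estimate:.6f}")
print(f"actual change in z:     {actual:.6f}")
```

The two numbers differ slightly because Eq. 2.20 is exact only in the limit of vanishingly small dx and dy; the gap shrinks as the steps are made smaller.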

The process of taking a partial derivative of a given function is quite straight- 
forward; if you know how to take ordinary derivatives, you already have the 
tools you’ll need to take partial derivatives. Simply treat all variables (with 
the exception of the one variable over which the derivative is being taken) as 
constants, and take the derivative as you normally would. This is best explained 
using an example. 

Consider a function such as z = f(x, y) = 6x²y + 3x + 5xy + 10. The terms
of this polynomial are sufficiently complex to make its shape less than obvious,
which is where a computational tool such as Mathematica or MATLAB can
be very handy. Writing a few lines of code will help you understand how this
function behaves, as you can see in Figure 2.9. Even a quick look at this warped
little plane makes it clear that the slope of the function is quite different in the
x- and y-directions, and the slope is also highly dependent on the location
on the surface. In a 3D plot such as Figure 2.9, it’s always easiest to see the
slope at the edges of the plotted region, so take a look at the slope along the
x-direction for a y value of -3. As x varies from -3 to +3 (while y is held
constant at -3), the slope starts off positive and gets less steep as you move
in the +x-direction from x = -3 toward x = 0. The slope then becomes zero
somewhere near x = 0, then turns negative and becomes increasingly steep as
x approaches +3. Doing the same quick analysis along the y-direction while
holding x constant at -3 indicates that the slope is approximately constant and
positive as y varies from -3 to +3.

Figure 2.9 Plot of the function z = f(x, y) = 6x²y + 3x + 5xy + 10
for -3 < x < 3 and -3 < y < 3.

Now that you have some idea of what to expect, you can take the partial
derivative of z = 6x²y + 3x + 5xy + 10 with respect to x simply by treating
the variable y as a constant:

$$\frac{\partial z}{\partial x} = 12xy + 3 + 5y. \tag{2.21}$$

Likewise, the partial derivative with respect to y is found by holding x
constant:

$$\frac{\partial z}{\partial y} = 6x^2 + 5x. \tag{2.22}$$


Before interpreting these derivative results, you may want to take a moment
to make sure you understand why the process of taking the derivative of a
function involves bringing down the exponent of the relevant variable and then
subtracting one from that exponent (so d(x²)/dx = 2x, for example). The answer
is quite straightforward. Since the derivative represents the change in the
function z as the independent variable x changes over a very small run, the formal
definition for this derivative can be written as

$$\frac{dz}{dx} = \lim_{\Delta x \to 0} \frac{z(x + \Delta x) - z(x)}{\Delta x}. \tag{2.23}$$

So in the case of z = x², you have

$$\frac{d(x^2)}{dx} = \lim_{\Delta x \to 0} \frac{(x + \Delta x)^2 - x^2}{\Delta x}. \tag{2.24}$$


If you think about the term in the numerator, you’ll see that this is x² + 2xΔx +
(Δx)² - x², which is just 2xΔx + (Δx)², and dividing this by Δx gives
2x + Δx. But as Δx approaches zero, the Δx term becomes negligible, and
this approaches 2x. So where did the 2 come from? It’s just the number of
cross terms (that is, terms with the product of x and Δx) that result from raising
(x + Δx) to the second power. Had you been taking the derivative of x³
with respect to x, you would have had three such cross terms. So you bring
down the exponent because that’s the number of cross terms that result from
taking x + Δx to that power. And why do you then subtract one from the
exponent? Simply because when you take the change in the function z (that is,
(x + Δx)² - x²), the highest-power terms (x² in this case) cancel, leaving only
terms of one lower power (x¹ in this case). It’s a bit laborious, but the same
analysis can be applied to show that d(x³)/dx = 3x² and that d(xⁿ)/dx = nx^(n-1).
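
For readers who want to see the cubic case written out, here is a short worked sketch that applies the limit definition of Eq. 2.23 to z = x³. It uses only the binomial expansion of (x + Δx)³; the three cross terms mentioned above appear as the coefficient 3 on the x²Δx term.

```latex
\begin{aligned}
\frac{d(x^3)}{dx}
  &= \lim_{\Delta x \to 0} \frac{(x + \Delta x)^3 - x^3}{\Delta x} \\
  &= \lim_{\Delta x \to 0}
     \frac{x^3 + 3x^2\,\Delta x + 3x\,(\Delta x)^2 + (\Delta x)^3 - x^3}{\Delta x} \\
  &= \lim_{\Delta x \to 0} \left[ 3x^2 + 3x\,\Delta x + (\Delta x)^2 \right] \\
  &= 3x^2 .
\end{aligned}
```

The x³ terms cancel, the surviving lowest-order term is one power lower in x, and every term that still carries a factor of Δx vanishes in the limit.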

So that’s why you bring down the exponent and subtract one, but what does
it mean when you take derivatives and get answers such as Eqs. 2.21 and 2.22?
It simply means that the slope varies with direction and location on the surface
z. So, for example, the slope along the x-direction at location (-3, 2) is
12xy + 3 + 5y = 12(-3)(2) + 3 + 5(2) = -59, while at the same location the
slope along the y-direction is 6x² + 5x = 6[(-3)²] + 5(-3) = 39.

You can do a rough check on your calculated partial derivative in Eq. 2.21
by inserting the value of -3 for y to see that the slope of z at this value of y is
12(x)(-3) + 3 + 5(-3) = -36x - 12. Thus as you move in the x-direction
at y = -3, the slope should vary from +96 at x = -3, to zero at x = -1/3,
and down to -120 at x = +3. This is consistent with the quick analysis of the
slope after Figure 2.9.

Likewise, Eq. 2.22 tells you that the slope of z in the y-direction at x = -3 is
constant and positive, also consistent with the behavior expected from a quick
analysis of the shape of the function z.
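
The slope values quoted above can be reproduced in a few lines. The following minimal sketch, in plain Python, evaluates Eqs. 2.21 and 2.22 directly and also estimates the same slopes by nudging one variable at a time while holding the other fixed. The function names and the step size h are assumptions made for illustration.

```python
def z(x, y):
    # The example surface from the text: z = 6x^2 y + 3x + 5xy + 10.
    return 6 * x**2 * y + 3 * x + 5 * x * y + 10

def dz_dx(x, y):
    # Eq. 2.21: partial derivative with respect to x.
    return 12 * x * y + 3 + 5 * y

def dz_dy(x, y):
    # Eq. 2.22: partial derivative with respect to y.
    return 6 * x**2 + 5 * x

def numeric_partial_x(f, x, y, h=1e-6):
    # Central difference in x with y held constant.
    return (f(x + h, y) - f(x - h, y)) / (2 * h)

def numeric_partial_y(f, x, y, h=1e-6):
    # Central difference in y with x held constant.
    return (f(x, y + h) - f(x, y - h)) / (2 * h)

# Slopes at the location (-3, 2) discussed in the text.
print("x-direction slope:", dz_dx(-3, 2), "numeric:", numeric_partial_x(z, -3, 2))
print("y-direction slope:", dz_dy(-3, 2), "numeric:", numeric_partial_y(z, -3, 2))

# Slope along x at y = -3, which the text reduces to -36x - 12.
for x in (-3, -1 / 3, 3):
    print(f"x = {x:+.3f}  slope along x at y = -3: {dz_dx(x, -3):+.3f}")
```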

And just as you can take “higher order” ordinary derivatives such as
d/dx(dz/dx) = d²z/dx² and d/dy(dz/dy) = d²z/dy², you can also take higher-order
partial derivatives. So for example ∂/∂x(∂z/∂x) = ∂²z/∂x² tells you the change in
the x-direction slope of z as you move along the x-direction, and
∂/∂y(∂z/∂y) = ∂²z/∂y² tells you the change in the y-direction slope as you move
along the y-direction.

It’s important for you to realize that an expression such as ∂²z/∂x² is the
derivative of a derivative, which is not the same as (∂z/∂x)², which is the square
of a first derivative. That’s easy to verify for the example given above, in
which ∂z/∂x = 12xy + 3 + 5y. In that case, ∂²z/∂x² = 12y, whereas (∂z/∂x)² =
(12xy + 3 + 5y)². By convention the order of the derivative is always written
between the “d” or “∂” and the function, as d²z or ∂²z, so be sure to look
carefully at the location of superscripts when you’re dealing with derivatives.

You may also have occasion to use “mixed” partial derivatives such as
∂/∂x(∂z/∂y) = ∂²z/∂x∂y. If you’ve been tracking the discussion of partial
derivatives as slopes of functions in various directions, you can probably guess
that ∂²z/∂x∂y represents the change in the y-direction slope as you move along
the x-direction, and ∂²z/∂y∂x represents the change in the x-direction slope as
you move along the y-direction. Thankfully, for well-behaved⁵ functions these
expressions are interchangeable, so you can take the partial derivatives in
either order. You can easily verify this for the example given above by comparing
∂/∂y(∂z/∂x) from Eq. 2.21 with ∂/∂x(∂z/∂y) from Eq. 2.22 (the result is 12x + 5
in both cases).

5 What exactly is a “well-behaved” function? Typically this means any function that is
continuous and has continuous derivatives over the region of interest.
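
The equality of the mixed partial derivatives, and the difference between a second derivative and the square of a first derivative, can both be checked numerically for the example surface. This is a minimal sketch in plain Python that builds second derivatives by nesting central differences; the helper names, the step size h, and the evaluation point are assumptions made for illustration.

```python
def z(x, y):
    # The example surface from the text: z = 6x^2 y + 3x + 5xy + 10.
    return 6 * x**2 * y + 3 * x + 5 * x * y + 10

def d_dx(f, h=1e-3):
    # Return a new function that approximates the partial derivative of f in x.
    return lambda x, y: (f(x + h, y) - f(x - h, y)) / (2 * h)

def d_dy(f, h=1e-3):
    # Return a new function that approximates the partial derivative of f in y.
    return lambda x, y: (f(x, y + h) - f(x, y - h)) / (2 * h)

z_x = d_dx(z)       # first derivative in x
z_xx = d_dx(z_x)    # derivative of a derivative, expected 12y
z_xy = d_dy(z_x)    # take x first, then y
z_yx = d_dx(d_dy(z))  # take y first, then x

x0, y0 = -3.0, 2.0
print("x then y:          ", z_xy(x0, y0))
print("y then x:          ", z_yx(x0, y0))
print("expected 12x + 5:  ", 12 * x0 + 5)
print("second derivative: ", z_xx(x0, y0), "(expected 12y =", 12 * y0, ")")
print("first deriv squared:", z_x(x0, y0) ** 2)
```

Nesting finite differences amplifies rounding error, so a larger step (here 1e-3) is a safer choice than the very small steps that work for first derivatives.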

There’s another widely used aspect of partial derivatives you should make 
sure you understand, and that’s the chain rule. Up to this point, we’ve been 
dealing with functions such as z = f(x, y ) without considering the fact that 
the variables x and y may themselves be functions of other variables. It’s com- 
mon to call these other variables u and v and to allow both x and y to depend 
on one or both of u and v. You may encounter situations in which you know 
the variation in u and v, and you want to know how much your function z will 
change due to those changes. In such cases, the chain rule for partial derivatives 
gives you the answer: 


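$$\frac{\partial z}{\partial u} = \frac{\partial z}{\partial x}\frac{\partial x}{\partial u} + \frac{\partial z}{\partial y}\frac{\partial y}{\partial u}, \tag{2.25}$$

and

$$\frac{\partial z}{\partial v} = \frac{\partial z}{\partial x}\frac{\partial x}{\partial v} + \frac{\partial z}{\partial y}\frac{\partial y}{\partial v}. \tag{2.26}$$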

The chain rule is a concise expression of the fact that z depends on both x
and y, and since both x and y may change if u changes, the change in z with
respect to u is the sum of two terms. The first term is the change in x due to
the change in u (∂x/∂u) times the change in z due to that change in x (∂z/∂x), and
the second term is the change in y due to the change in u (∂y/∂u) times the change
in z due to that change in y (∂z/∂y). Adding those two terms together gives you
Eq. 2.25, and the same reasoning applied to changes in z caused by changes in
v leads to Eq. 2.26.
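
Here is one way to see Eqs. 2.25 and 2.26 at work. The sketch below, in plain Python, reuses the example surface z = 6x²y + 3x + 5xy + 10 and assumes, purely for illustration, that x = uv and y = u + 2v; these dependences on u and v are hypothetical and do not come from the text. The chain-rule result is compared with a direct finite-difference estimate on the composite function.

```python
def z(x, y):
    # The example surface from the text.
    return 6 * x**2 * y + 3 * x + 5 * x * y + 10

def dz_dx(x, y):
    return 12 * x * y + 3 + 5 * y   # Eq. 2.21

def dz_dy(x, y):
    return 6 * x**2 + 5 * x         # Eq. 2.22

def x_of(u, v):
    # Hypothetical dependence of x on u and v (an assumption for this sketch).
    return u * v

def y_of(u, v):
    # Hypothetical dependence of y on u and v (an assumption for this sketch).
    return u + 2 * v

def chain_rule(u, v):
    # Eqs. 2.25 and 2.26, using the exact partials of x_of and y_of.
    x, y = x_of(u, v), y_of(u, v)
    dx_du, dx_dv = v, u
    dy_du, dy_dv = 1, 2
    dz_du = dz_dx(x, y) * dx_du + dz_dy(x, y) * dy_du
    dz_dv = dz_dx(x, y) * dx_dv + dz_dy(x, y) * dy_dv
    return dz_du, dz_dv

def composite(u, v):
    # z written directly as a function of u and v.
    return z(x_of(u, v), y_of(u, v))

def finite_difference(u, v, h=1e-5):
    # Direct numerical estimate of the same two derivatives.
    dz_du = (composite(u + h, v) - composite(u - h, v)) / (2 * h)
    dz_dv = (composite(u, v + h) - composite(u, v - h)) / (2 * h)
    return dz_du, dz_dv

print("chain rule:        ", chain_rule(1, 2))
print("finite difference: ", finite_difference(1, 2))
```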


2.6 Vectors as derivatives 

In many texts dealing with vectors and tensors, you’ll find that vectors are
equated to “directional derivatives” and that partial derivatives such as ∂/∂x and
∂/∂y are referred to as basis vectors along the coordinate axes.

To understand this correspondence between vectors and derivatives, consider
a path such as that shown in Figure 2.10. You can think of this as a
path along which you’re travelling with velocity v; for simplicity imagine
that this path lies in the xy plane. Now imagine that you’re keeping track of
time as you move, so you assign a value (such as the t values shown in the
figure) to each point on the curve. By marking the curve with values, you have
“parameterized” the curve (with t as your parameter).⁶ Note that there need
not be equal distance along the curve between your parameter values (there
definitely won’t be if you choose time as your parameter and then change your
speed as you move; the reckless driver depicted in Figure 2.10 has apparently
sped up in the turn).

Figure 2.10 Parameterized curve and tangent vectors.

As a final bit of visualization, imagine that this curve lies in a region in which 
the air temperature is different at each location. So as you move along the 
curve, you will experience the spatial change in air temperature as a temporal 
change (in other words, you’ll be able to make a graph of air temperature vs. 
time). Of course, how fast the air temperature changes for you will depend 
both on the distance between measurable changes in the temperature in the 
direction you’re heading and on your speed (how fast you’re covering that 
distance). 

With this scenario in mind, the concept of a directional derivative is easy
to understand. If the function f(x, y) describes the temperature at each x, y
location, the directional derivative (df/dt) tells you how much the value of the
function f changes as you move a small distance along the curve (in time dt).
But recall the chain rule:

$$\frac{df}{dt} = \frac{dx}{dt}\frac{\partial f}{\partial x} + \frac{dy}{dt}\frac{\partial f}{\partial y}. \tag{2.27}$$


This equation says simply that the directional derivative of the function f along
the curve parameterized by t (that is, df/dt) equals the rate of change of the
x-coordinate (dx/dt) as you move along the curve times the rate of change of the
temperature function with x (∂f/∂x) plus the rate of change of the y-coordinate
(dy/dt) as you move along the curve times the rate of change of the temperature
function with y (∂f/∂y). But (dx/dt) is just v_x, the x-component of your velocity,
and (dy/dt) is v_y, the y-component of your velocity. And since you know that your
velocity is a vector that is always tangent to the path on which you’re moving,
you can consider the directional derivative df/dt to be a vector with direction
tangent to the curve and with length equal to the rate of change of f with t
(that is, the time rate of change of the air temperature).

6 Some authors are careful to distinguish between a “path” and a “curve,” using “curve” only
when a parameter has been assigned to each point on a path.

Now here’s the important concept: since f can be any function, you can
write Eq. 2.27 as an “operator” equation (that is, an equation waiting to be fed
a function on which it can operate):

$$\frac{d}{dt} = \frac{dx}{dt}\frac{\partial}{\partial x} + \frac{dy}{dt}\frac{\partial}{\partial y}. \tag{2.28}$$

The trick to seeing the connection between derivatives and vectors is to view 
this equation as a vector equation in which 

Vector = x-component • x basis vector + y-component • y basis vector. 

Comparing this to Eq. 2.28, you should be able to see that the directional
derivative operator d/dt represents the tangent vector to the curve, the dx/dt and
dy/dt terms represent the x- and y-components of that vector, and the operators
∂/∂x and ∂/∂y represent the basis vectors in the direction of the x and y
coordinate axes.

Of course, it’s not just air temperature that can be represented by f(x, y);
this function can represent anything that is spatially distributed in the region
around your curve. So f(x, y) could represent the height of the road, the quality
of the scenery, or any other quantity that varies in the vicinity of your curve.
Likewise, you could have chosen to parameterize your path with markers other
than time; had you assigned a value s or λ to each point on your path, the
directional derivative d/ds or d/dλ would still represent the tangent vector to the
curve, dx/ds or dx/dλ would still represent the x-component of that vector, and
dy/ds or dy/dλ would still represent the y-component of that vector.
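
The decomposition in Eq. 2.27 can be tested on a concrete path. The sketch below, in plain Python, assumes a made-up temperature field f(x, y) = 20 + 2x - 0.5y² and a made-up path x = t, y = t²; neither comes from the text. It builds df/dt from the velocity components and the partial derivatives of f, then compares the result with the temperature change actually experienced along the path.

```python
def temperature(x, y):
    # Hypothetical temperature field, assumed only for this illustration.
    return 20.0 + 2.0 * x - 0.5 * y**2

def path(t):
    # Hypothetical parameterized curve: position (x, y) at parameter value t.
    return (t, t**2)

def velocity(t, h=1e-6):
    # Components (dx/dt, dy/dt) of the tangent vector, by central difference.
    x_plus, y_plus = path(t + h)
    x_minus, y_minus = path(t - h)
    return ((x_plus - x_minus) / (2 * h), (y_plus - y_minus) / (2 * h))

def partials(f, x, y, h=1e-6):
    # Slopes of f along the x and y coordinate axes at the point (x, y).
    df_dx = (f(x + h, y) - f(x - h, y)) / (2 * h)
    df_dy = (f(x, y + h) - f(x, y - h)) / (2 * h)
    return (df_dx, df_dy)

def directional_derivative(t):
    # Eq. 2.27: df/dt = (dx/dt)(df/dx) + (dy/dt)(df/dy).
    x, y = path(t)
    vx, vy = velocity(t)
    df_dx, df_dy = partials(temperature, x, y)
    return vx * df_dx + vy * df_dy

def df_dt_direct(t, h=1e-6):
    # Rate of change of temperature actually experienced along the path.
    return (temperature(*path(t + h)) - temperature(*path(t - h))) / (2 * h)

t0 = 1.5
print("from Eq. 2.27:  ", directional_derivative(t0))
print("along the path: ", df_dt_direct(t0))
```

Replacing `path` with a different parameterization of the same curve changes the velocity components and therefore df/dt, while the partial derivatives of the field at each location stay the same.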

If you plan to proceed on to the study of tensors, you will find that under- 
standing this relationship between basis vectors along the coordinate axes and 
partial derivatives is of significant value. 


2.7 Nabla - the del operator 

The partial derivatives discussed in the previous section can be put to use in 
a wide range of problems, and when you come across such problems you 
may find that they involve equations that contain an inverted upper case delta
wearing a vector hat (∇). This symbol represents a vector differential operator
called “nabla” or “del,” and its presence instructs you to take derivatives of the
quantity on which the operator is acting. The exact form of those derivatives
depends on the symbol following the del operator, with “∇( )” signifying gradient,
“∇ o” signifying divergence, “∇ x” indicating curl, and “∇²( )” signifying
the Laplacian. Each of these operations is discussed in later sections; for now
we’ll just consider what an operator is and how the del operator can be written
in Cartesian coordinates.

Like all good mathematical operators, del is an action waiting to happen.
Just as √( ) tells you to take the square root of anything that appears under its
roof, ∇ is an instruction to take derivatives in three directions. Specifically, in
Cartesian coordinates

$$\vec{\nabla} = \hat{\imath}\,\frac{\partial}{\partial x} + \hat{\jmath}\,\frac{\partial}{\partial y} + \hat{k}\,\frac{\partial}{\partial z}, \tag{2.29}$$

where i, j, and k are the unit vectors in the direction of the Cartesian
coordinates x, y, and z.

This expression may appear strange, since in this form it’s lacking anything 
on which it can operate. However, if you follow the del with a scalar or vector 
field, you can extract information about how those fields change in space. In 
this context, “field” refers to an array or collection of values defined at various 
locations. A scalar field is specified entirely by its magnitude at these locations: 
examples of scalar fields include the air temperature in a room and the height 
of terrain above sea level. A vector field is specified by both magnitude and 
direction at various locations: examples include electric, magnetic, and gravi- 
tational fields. Specific examples of how the del operator works on scalar and 
vector fields are given in the following sections. 


2.8 Gradient 

When the del operator ∇ is followed by a scalar field, the result of the operation
is called the gradient of the field. What does the gradient tell you about a
scalar field? Two important things: the magnitude of the gradient indicates how
quickly the field is changing over space, and the direction of the gradient
indicates the direction in which the field is increasing most quickly with distance.
So although the gradient operates on a scalar field, the result of the gradient
operation is a vector, with both magnitude and direction. Thus, if the scalar
field represents terrain height, the magnitude of the gradient at any location
tells you how steeply the ground is sloped at that location, and the direction of
the gradient points uphill along the steepest slope.

The definition of the gradient of the scalar field ψ in Cartesian coordinates is

$$\mathrm{grad}(\psi) = \vec{\nabla}\psi = \hat{\imath}\,\frac{\partial \psi}{\partial x} + \hat{\jmath}\,\frac{\partial \psi}{\partial y} + \hat{k}\,\frac{\partial \psi}{\partial z} \quad \text{(Cartesian)}. \tag{2.30}$$

Thus the x-component of the gradient of ψ indicates the slope of the scalar
field in the x-direction and the other components indicate the slope in the other
directions. The square root of the sum of the squares of these components
provides the total steepness of the slope at the location at which the gradient is
taken.

You can see a simple example of the result of the gradient operator by
considering the tilted plane in Figure 2.11(a). This plane is defined by the simple
equation ψ(x, y) = 5x + 2y, and you can find the gradient using the
two-dimensional version of Eq. 2.30:

$$\vec{\nabla}\psi = \hat{\imath}\,\frac{\partial (5x + 2y)}{\partial x} + \hat{\jmath}\,\frac{\partial (5x + 2y)}{\partial y} = 5\,\hat{\imath} + 2\,\hat{\jmath}.$$

So even though ψ is a scalar function, its gradient is a vector; it has a component
along the x-axis and a component along the y-axis. And what do these
components tell you?

For one thing, the fact that the x-component is more than twice the size of
the y-component tells you that the tilt of the plane is steeper in the x-direction
than in the y-direction. You can also tell that the slope in each direction is
constant, because the components are not functions of x or y. Both of those
conclusions are consistent with Figure 2.11(a).


Figure 2.11 Function ψ = 5x + 2y and the gradient and contours of ψ
(panels: side view and top view).

And if you wish to determine the magnitude of the gradient, that’s easily
done. Since the x-component of the gradient is 5 and the y-component is 2, the
magnitude of the gradient is simply (5² + 2²)^(1/2) = 5.39 over the entire plane.
You can also find the angle that the gradient vectors make with the positive
x-axis using arctan(2/5) = 21.8°. The gradient and contours of the central
portion of the function ψ are shown in Figure 2.11(b).
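
The numbers in this example are easy to reproduce. The following minimal sketch, in plain Python with only the standard `math` module, estimates the gradient of ψ = 5x + 2y by finite differences and then computes its magnitude and direction. The function names, the step size h, and the two sample locations are assumptions made for illustration.

```python
import math

def psi(x, y):
    # The tilted plane from the text: psi = 5x + 2y.
    return 5 * x + 2 * y

def gradient(f, x, y, h=1e-6):
    # Two-dimensional gradient: slopes of f along x and along y at (x, y).
    df_dx = (f(x + h, y) - f(x - h, y)) / (2 * h)
    df_dy = (f(x, y + h) - f(x, y - h)) / (2 * h)
    return (df_dx, df_dy)

def magnitude_and_angle(gx, gy):
    # Length of the gradient and its angle from the positive x-axis in degrees.
    return math.hypot(gx, gy), math.degrees(math.atan2(gy, gx))

# The gradient of a plane is the same everywhere, so two locations should agree.
for location in ((1.0, -2.0), (-4.0, 3.0)):
    gx, gy = gradient(psi, *location)
    size, angle = magnitude_and_angle(gx, gy)
    print(f"at {location}: gradient = ({gx:.3f}, {gy:.3f}), "
          f"magnitude = {size:.2f}, angle = {angle:.1f} degrees")
```

Using `atan2` rather than a plain arctangent of the ratio keeps the angle in the correct quadrant when either component is negative.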

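In cylindrical and spherical coordinates, the gradient is:

$$\vec{\nabla}\psi = \hat{r}\,\frac{\partial \psi}{\partial r} + \hat{\phi}\,\frac{1}{r}\frac{\partial \psi}{\partial \phi} + \hat{z}\,\frac{\partial \psi}{\partial z} \quad \text{(cylindrical)}, \tag{2.31}$$

and

$$\vec{\nabla}\psi = \hat{r}\,\frac{\partial \psi}{\partial r} + \hat{\theta}\,\frac{1}{r}\frac{\partial \psi}{\partial \theta} + \hat{\phi}\,\frac{1}{r \sin\theta}\frac{\partial \psi}{\partial \phi} \quad \text{(spherical)}. \tag{2.32}$$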

You’ll see more gradients in Section 2.11 covering the Laplacian opera- 
tor, which represents the divergence of the gradient. You can read about the 
divergence in the next section. 


2.9 Divergence 

When dealing with vector fields, you may encounter the del operator followed
by a dot (∇ o), signifying the divergence of a vector field. The concept of
divergence often arises in the areas of physics and engineering that deal with the
spatial variation of vector fields, because divergence describes the tendency
of vectors to “flow” into or out of a point of interest.⁷ Electrostatic fields, for
example, may be represented by vectors that point radially away from points 
at which positive electric charge exists, just as the flow vectors of a fluid point 
away from a source (such as an underwater spring). Likewise, electrostatic field 
vectors point toward locations at which negative charge is present, analogous 
to fluid flowing toward a sink or drain. It was the brilliant Scottish mathemat- 
ical physicist James Clerk Maxwell who coined the term “convergence” for 
the mathematical operation which measures the rate of vector “flow” toward 
a given location. In modern usage we consider the opposite behavior (vectors 
flowing away from a point), and outward flow is considered positive diver- 
gence. In the case of fluid flow, the divergence at any point is a measure of the 
tendency of the flow vectors to diverge from that point (that is, to carry more 
material away from it than toward it). Thus points of positive divergence mark 
the location of sources, while points of negative divergence show you where 
the sinks are located. 

7 In many instances, nothing in the vector field is actually flowing; the word “flow” is used only
as an analogy in which the arrows pointing in the direction of the field are imagined to
represent the physical flow of an incompressible fluid.

To understand how this works, take a look at the vector fields shown in
Figures 2.12 and 2.13. To find the locations of positive divergence in each of 
these fields, look for points at which the flow vectors either spread out or are 
larger pointing away from the location and shorter pointing toward it. Some 
authors suggest that you imagine sprinkling sawdust on flowing water to assess 
the divergence; if the sawdust is dispersed, you have selected a point of positive 
divergence, while if it becomes more concentrated, you’ve picked a location of 
negative divergence. 

Using such tests, it’s clear that locations such as 1 and 2 in Figure 2.12 
and locations 4 and 5 in Figure 2.13(a) are points of positive divergence (flow 
away from these points exceeds flow toward), while the divergence is negative 
at point 3 in Figure 2.12 (flow toward exceeds flow away). 

The divergence at various points in Figure 2.13(b) is less obvious. Location 
6 is obviously a point of positive divergence, but what about locations 7 and 8? 
The flow lines are clearly spreading out at those locations, as they do at location 
5 in Figure 2.13(a), but they’re also getting shorter pointing away. Does the 
spreading out compensate for the slowing down of the flow? 



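Figure 2.12 Parallel vector field with varying amplitude.

Figure 2.13 Radial vector fields with varying amplitudes.

Answering that question requires a useful mathematical form of the divergence
as well as a description of how the vector field varies from place to place.
The differential form of the mathematical operation of divergence or “del dot”
($\vec{\nabla}\circ$) on a vector $\vec{A}$ in Cartesian coordinates is

$$\vec{\nabla}\circ\vec{A} = \left(\hat{\imath}\frac{\partial}{\partial x} + \hat{\jmath}\frac{\partial}{\partial y} + \hat{k}\frac{\partial}{\partial z}\right)\circ\left(\hat{\imath}A_x + \hat{\jmath}A_y + \hat{k}A_z\right), \tag{2.33}$$

and, since $\hat{\imath}\circ\hat{\imath} = \hat{\jmath}\circ\hat{\jmath} = \hat{k}\circ\hat{k} = 1$, this is

$$\vec{\nabla}\circ\vec{A} = \left(\frac{\partial A_x}{\partial x} + \frac{\partial A_y}{\partial y} + \frac{\partial A_z}{\partial z}\right). \tag{2.34}$$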


Thus the divergence of A is simply the change in its x-component along the 
x-axis plus the change in its y-component along the y-axis plus the change in 
its z-component along the z-axis. Notice that the divergence of a vector field is 
a scalar quantity; it has magnitude but no direction. 

You can now apply this to the vector field in Figure 2.12. In Figure 2.12,
assume that the magnitude of the vector field varies sinusoidally along the
x-axis as $\vec{A} = \sin(\pi x)\,\hat{\imath}$ while remaining constant in the y- and z-directions.
Thus,

$$\vec{\nabla}\circ\vec{A} = \frac{\partial A_x}{\partial x} = \pi\cos(\pi x), \tag{2.35}$$

since $A_y$ and $A_z$ are zero. This expression is positive for $0 < x < 1/2$, 0 at
$x = 1/2$, and negative for $1/2 < x < 3/2$, just as a visual inspection suggests.
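
The sign pattern of Eq. 2.35 can also be checked numerically. The following is a minimal sketch, not part of the original treatment: the helper name `divergence`, the step size `h`, and the sample points are all assumptions chosen for illustration. It estimates each term of the Cartesian divergence with a central difference and compares the result with $\pi\cos(\pi x)$.

```python
import math

def divergence(field, point, h=1e-5):
    """Estimate div A at `point` by summing central differences dA_i/dx_i."""
    total = 0.0
    for axis in range(3):
        ahead = list(point)
        behind = list(point)
        ahead[axis] += h
        behind[axis] -= h
        total += (field(*ahead)[axis] - field(*behind)[axis]) / (2.0 * h)
    return total

def parallel_field(x, y, z):
    """Field of Figure 2.12 as assumed in the text: A = sin(pi x) i-hat."""
    return (math.sin(math.pi * x), 0.0, 0.0)

# Sample one point on each side of x = 1/2, plus x = 1/2 itself.
for x in (0.25, 0.5, 0.75):
    numeric = divergence(parallel_field, (x, 0.0, 0.0))
    exact = math.pi * math.cos(math.pi * x)
    print(f"x = {x:4.2f}  numeric = {numeric:+.6f}  pi*cos(pi*x) = {exact:+.6f}")
```

The estimate should come out positive to the left of x = 1/2, essentially zero at x = 1/2, and negative to the right, matching the visual inspection of the figure.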

Now consider Figure 2.13(a), which represents a slice through a spherically
symmetric vector field with amplitude increasing as the square of the distance
from the origin. Thus $\vec{A} = r^2\hat{r}$. Since $r^2 = (x^2 + y^2 + z^2)$ and

$$\hat{r} = \frac{x\hat{\imath} + y\hat{\jmath} + z\hat{k}}{\sqrt{x^2 + y^2 + z^2}},$$

this means

$$\vec{A} = r^2\hat{r} = (x^2 + y^2 + z^2)\,\frac{x\hat{\imath} + y\hat{\jmath} + z\hat{k}}{\sqrt{x^2 + y^2 + z^2}} = (x^2 + y^2 + z^2)^{1/2}\,(x\hat{\imath} + y\hat{\jmath} + z\hat{k}),$$

and

$$\frac{\partial A_x}{\partial x} = (x^2 + y^2 + z^2)^{1/2} + x\left(\frac{1}{2}\right)(x^2 + y^2 + z^2)^{-1/2}(2x).$$

Doing likewise for the y- and z-components and adding yields

$$\vec{\nabla}\circ\vec{A} = 3(x^2 + y^2 + z^2)^{1/2} + \frac{(x^2 + y^2 + z^2)}{\sqrt{x^2 + y^2 + z^2}} = 4(x^2 + y^2 + z^2)^{1/2} = 4r.$$


Thus the divergence in the vector field in Figure 2.13(a) is increasing linearly 
with distance from the origin. 

Finally, consider the vector field in Figure 2.13(b), which is similar to the
previous case but with the amplitude of the vector field decreasing as the
square of the distance from the origin. The flow lines are spreading out as they
were in Figure 2.13(a), but in this case you might suspect that the decreasing
amplitude of the vector field will affect the value of the divergence. Since
$\vec{A} = (1/r^2)\hat{r}$,

$$\vec{A} = \frac{x\hat{\imath} + y\hat{\jmath} + z\hat{k}}{(x^2 + y^2 + z^2)\sqrt{x^2 + y^2 + z^2}} = \frac{x\hat{\imath} + y\hat{\jmath} + z\hat{k}}{(x^2 + y^2 + z^2)^{3/2}},$$

and

$$\frac{\partial A_x}{\partial x} = \frac{1}{(x^2 + y^2 + z^2)^{3/2}} - x\left(\frac{3}{2}\right)(x^2 + y^2 + z^2)^{-5/2}(2x).$$

Adding in the y- and z-derivatives gives

$$\vec{\nabla}\circ\vec{A} = \frac{3}{(x^2 + y^2 + z^2)^{3/2}} - \frac{3(x^2 + y^2 + z^2)}{(x^2 + y^2 + z^2)^{5/2}} = 0.$$

This validates the suspicion that the reduced amplitude of the vector field with
distance from the origin may compensate for the spreading out of the flow
lines. Note that this is true only for the case in which the amplitude of the
vector field falls off as $1/r^2$ (and only for points away from the origin). 8

Therefore, you must consider two key factors in determining the divergence
at any point: the spacing and the relative amplitudes of the field lines at that
point. These factors both contribute to the total flow of field lines into or out of
an infinitesimally small volume around the point. If the outward flow exceeds
the inward flow, the divergence is positive at that point. If the outward flow is
less than the inward flow, the divergence is negative, and if the outward and
inward flows are equal the divergence is zero at that point.
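
The "flow into or out of a small volume" picture can be tried directly. The sketch below is an illustration under stated assumptions rather than a method from the text: the function names, the cube half-width `h`, and the test point are hypothetical, and each cube face is represented only by the field value at its midpoint. It adds up the outward flow through the six faces of a tiny cube and divides by the cube's volume, for the two radial fields of Figure 2.13.

```python
import math

def flux_per_volume(field, center, h=1e-3):
    """Net outward flux through a cube of side 2h around `center`, divided by
    the cube's volume. Each face is approximated by the field value at its
    midpoint, so this is only a low-order estimate of the divergence."""
    net = 0.0
    for axis in range(3):
        for sign in (+1.0, -1.0):
            face = list(center)
            face[axis] += sign * h
            # Outward normal is +/- the axis direction; face area is (2h)^2.
            net += sign * field(*face)[axis] * (2.0 * h) ** 2
    return net / (2.0 * h) ** 3

def radial_field(power):
    """Return A = r**power * r-hat as a function of (x, y, z)."""
    def field(x, y, z):
        r = math.sqrt(x * x + y * y + z * z)
        scale = r ** (power - 1)  # r**power * (x, y, z) / r
        return (scale * x, scale * y, scale * z)
    return field

point = (1.0, 2.0, 2.0)  # a point at distance r = 3 from the origin
growing = flux_per_volume(radial_field(2), point)   # compare with 4r = 12
falling = flux_per_volume(radial_field(-2), point)  # compare with 0
print(f"A = r^2 r-hat:     {growing:.6f}")
print(f"A = (1/r^2) r-hat: {falling:.6f}")
```

For the growing field the outward flow wins, while for the $1/r^2$ field the inward and outward flows through the cube very nearly cancel.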

8 At the origin, where r = 0, a $(1/r^2)\hat{r}$ vector field experiences a singularity, and the Dirac delta
function must be employed to determine the divergence.

So far the divergence has been calculated for the Cartesian coordinate system,
but depending on the symmetries of the problem, it might be solved
more easily using non-Cartesian systems. The divergence may be calculated
in cylindrical and spherical coordinate systems using

$$\vec{\nabla}\circ\vec{A} = \frac{1}{r}\frac{\partial}{\partial r}(rA_r) + \frac{1}{r}\frac{\partial A_\phi}{\partial \phi} + \frac{\partial A_z}{\partial z} \quad \text{(cylindrical)} \tag{2.36}$$

and

$$\vec{\nabla}\circ\vec{A} = \frac{1}{r^2}\frac{\partial}{\partial r}(r^2A_r) + \frac{1}{r\sin\theta}\frac{\partial}{\partial \theta}(A_\theta\sin\theta) + \frac{1}{r\sin\theta}\frac{\partial A_\phi}{\partial \phi} \quad \text{(spherical)} \tag{2.37}$$

If you doubt the efficacy of choosing the proper coordinate system, 
you should re-work the last two examples in this section using spherical 
coordinates. 
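
As a sketch of that re-working (valid for points with $r \neq 0$): both radial fields have only an $A_r$ component, so only the first term of the spherical expression (Eq. 2.37) survives.

```latex
% Field of Figure 2.13(a): A_r = r^2
\vec{\nabla}\circ\vec{A}
  = \frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2 \cdot r^2\right)
  = \frac{1}{r^2}\left(4r^3\right)
  = 4r

% Field of Figure 2.13(b): A_r = 1/r^2
\vec{\nabla}\circ\vec{A}
  = \frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2 \cdot \frac{1}{r^2}\right)
  = \frac{1}{r^2}\frac{\partial (1)}{\partial r}
  = 0
```

Each result takes one line, compared with the several steps needed in Cartesian coordinates.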


2.10 Curl 

The del operator followed by a cross ($\vec{\nabla}\times$) signifies the differential operation
of curl. The curl of a vector field is a measure of the field’s tendency to circulate 
about a point, much like the divergence is a measure of the tendency of the 
field to flow away from a point. But unlike the divergence, which produces 
a scalar result, the curl produces a vector. The magnitude of the curl vector 
is proportional to the amount of circulation of the field around the point of 
interest, and the direction of the curl vector is perpendicular to the plane in 
which the field’s circulation is a maximum. 

The curl at a point in a vector field can be understood by considering the
vector fields shown in Figure 2.14. To find the locations of large curl in each
of these fields, look for points at which the flow vectors on one side of the
point are significantly different (in magnitude, direction, or both) from the
flow vectors on the opposite side of the point. Once again a thought experiment
is helpful: imagine holding a tiny paddlewheel at each point in the flow.
If the flow would cause the paddlewheel to rotate, the center of the wheel
marks a point of non-zero curl. The direction of the curl is along the axis of the
paddlewheel. By convention, the positive-curl direction is determined by the
right-hand rule: if you curl the fingers of your right hand along the circulation
direction, your thumb points in the direction of positive curl.

Figure 2.14 Vector fields with various values of curl.

Using the paddlewheel test, you can see that points 1, 2, and 3 in 
Figure 2.14(a) and point 5 in Figure 2.14(b) are high-curl locations, and some 
curl also exists at point 4. The uniform flow around point 6 and the diverging
flow lines around point 7 in Figure 2.14(c) would not cause a tiny paddlewheel
to rotate, meaning that these are points of low or zero curl.

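To make this quantitative, you can use the differential form of the curl or
“del cross” ($\vec{\nabla}\times$) operator in Cartesian coordinates:

$$\vec{\nabla}\times\vec{A} = \left(\hat{\imath}\frac{\partial}{\partial x} + \hat{\jmath}\frac{\partial}{\partial y} + \hat{k}\frac{\partial}{\partial z}\right)\times\left(\hat{\imath}A_x + \hat{\jmath}A_y + \hat{k}A_z\right). \tag{2.38}$$

Recall that the vector cross-product may be written as a determinant:

$$\vec{\nabla}\times\vec{A} = \begin{vmatrix} \hat{\imath} & \hat{\jmath} & \hat{k} \\ \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\ A_x & A_y & A_z \end{vmatrix}, \tag{2.39}$$

which expands to

$$\vec{\nabla}\times\vec{A} = \left(\frac{\partial A_z}{\partial y} - \frac{\partial A_y}{\partial z}\right)\hat{\imath} + \left(\frac{\partial A_x}{\partial z} - \frac{\partial A_z}{\partial x}\right)\hat{\jmath} + \left(\frac{\partial A_y}{\partial x} - \frac{\partial A_x}{\partial y}\right)\hat{k}. \tag{2.40}$$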


Notice that each component of the curl of A indicates the tendency of the 
field to rotate in one of the coordinate planes. If the curl of the field has a 
large x-component, it means that the field has significant circulation about 
that point in the yz plane. The overall direction of the curl represents the axis 
about which the rotation is greatest, with the sense of the rotation given by the 
right-hand rule. 
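
The component-by-component reading of Eq. 2.40 can be tried on simple fields. The following is a minimal sketch under stated assumptions: the helper `curl`, the step size `h`, and the two test fields are hypothetical choices for illustration, not fields taken from the figures. Each partial derivative is estimated with a central difference.

```python
def curl(field, point, h=1e-5):
    """Central-difference estimate of curl A at `point`, following Eq. 2.40."""
    def partial(component, axis):
        ahead = list(point)
        behind = list(point)
        ahead[axis] += h
        behind[axis] -= h
        return (field(*ahead)[component] - field(*behind)[component]) / (2.0 * h)

    X, Y, Z = 0, 1, 2
    return (
        partial(Z, Y) - partial(Y, Z),  # x-component: dAz/dy - dAy/dz
        partial(X, Z) - partial(Z, X),  # y-component: dAx/dz - dAz/dx
        partial(Y, X) - partial(X, Y),  # z-component: dAy/dx - dAx/dy
    )

def rotating_field(x, y, z):
    """Hypothetical field circulating counter-clockwise about the z-axis."""
    return (-y, x, 0.0)

def uniform_field(x, y, z):
    """Uniform flow in the x-direction; a paddlewheel would not turn."""
    return (1.0, 0.0, 0.0)

print(curl(rotating_field, (0.3, -0.2, 1.0)))  # analytic curl is (0, 0, 2)
print(curl(uniform_field, (0.3, -0.2, 1.0)))   # analytic curl is (0, 0, 0)
```

The rotating field circulates in the xy plane, so only the z-component of its curl is non-zero, and the right-hand rule gives the positive z-direction for counter-clockwise circulation.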

If you’re wondering how the terms in this equation measure rotation,
consider the vector fields shown in Figure 2.15. Look first at the field in
Figure 2.15(a) and the x-component of the curl in the equation: this term
involves the change in $A_z$ with y and the change in $A_y$ with z. Proceeding
in the positive y-direction from the left side of the point of interest to the right,
$A_z$ is clearly increasing (it’s pointing in the negative z-direction on the left side
of the point of interest and the positive z-direction on the right side), so the term
$\partial A_z/\partial y$ must be positive. Looking now at $A_y$, you can see that it is positive below
the point of interest and negative above, so it is decreasing in the positive
z-direction. Thus $\partial A_y/\partial z$ is negative, which means that it increases the value of the
curl when it is subtracted from $\partial A_z/\partial y$. Thus the curl has a large value at the point
of interest, as expected in light of the circulation of $\vec{A}$ about this point.

The situation in Figure 2.15(b) is quite different. In this case, both $\partial A_y/\partial z$ and
$\partial A_z/\partial y$ are positive, and subtracting $\partial A_y/\partial z$ from $\partial A_z/\partial y$ gives a small result. The value
of the x-component of the curl is therefore small in this case. Vector fields with
zero curl at all points are called “irrotational.”

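Here are expressions for the curl in cylindrical and spherical coordinates:

$$\vec{\nabla}\times\vec{A} = \left(\frac{1}{r}\frac{\partial A_z}{\partial \phi} - \frac{\partial A_\phi}{\partial z}\right)\hat{r} + \left(\frac{\partial A_r}{\partial z} - \frac{\partial A_z}{\partial r}\right)\hat{\phi} + \frac{1}{r}\left(\frac{\partial (rA_\phi)}{\partial r} - \frac{\partial A_r}{\partial \phi}\right)\hat{z} \quad \text{(cylindrical)} \tag{2.41}$$

$$\vec{\nabla}\times\vec{A} = \frac{1}{r\sin\theta}\left(\frac{\partial (A_\phi\sin\theta)}{\partial \theta} - \frac{\partial A_\theta}{\partial \phi}\right)\hat{r} + \frac{1}{r}\left(\frac{1}{\sin\theta}\frac{\partial A_r}{\partial \phi} - \frac{\partial (rA_\phi)}{\partial r}\right)\hat{\theta} + \frac{1}{r}\left(\frac{\partial (rA_\theta)}{\partial r} - \frac{\partial A_r}{\partial \theta}\right)\hat{\phi} \quad \text{(spherical)} \tag{2.42}$$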


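A common misconception is that the curl of a vector field is non-zero wherever
the field appears to curve. However, just as the divergence depended both
on the spreading out and the changing length of field lines, the curl depends not
only on the curvature of the lines but also on the strength of the field. Consider
a curving field that points in the $\hat{\phi}$ direction and decreases as 1/r:

$$\vec{A} = \frac{k}{r}\hat{\phi}.$$

Finding the curl of this field is particularly straightforward in cylindrical
coordinates:

$$\vec{\nabla}\times\vec{A} = \left(\frac{1}{r}\frac{\partial A_z}{\partial \phi} - \frac{\partial A_\phi}{\partial z}\right)\hat{r} + \left(\frac{\partial A_r}{\partial z} - \frac{\partial A_z}{\partial r}\right)\hat{\phi} + \frac{1}{r}\left(\frac{\partial (rA_\phi)}{\partial r} - \frac{\partial A_r}{\partial \phi}\right)\hat{z}.$$

Since $A_r$ and $A_z$ are both zero, this is

$$\vec{\nabla}\times\vec{A} = \left(-\frac{\partial A_\phi}{\partial z}\right)\hat{r} + \frac{1}{r}\left(\frac{\partial (rA_\phi)}{\partial r}\right)\hat{z} = -\left(\frac{\partial (k/r)}{\partial z}\right)\hat{r} + \frac{1}{r}\left(\frac{\partial (r\,k/r)}{\partial r}\right)\hat{z} = 0.$$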

To understand the physical basis for this result, consider again the fluid-flow 
and paddlewheel analogy. Imagine the forces on the paddlewheel placed in the 
field shown in Figure 2.16(a). The center of curvature is well below the bottom 
of the figure, and the spacing of the arrows indicates that the field is getting 
weaker with distance from the center. At first glance, it may seem that this 
paddlewheel would rotate clockwise due to the curvature of the field, since the 
flow lines are pointing slightly upward at the left paddle and slightly downward 
at the right. But consider the effect of the weakening of the field above the axis 
of the paddlewheel: the top paddle receives a weaker push from the field than 
the bottom paddle, as shown in Figure 2.16(b). The stronger force on the bot- 
tom paddle will attempt to cause the paddlewheel to rotate counter-clockwise. 
Thus the downward curvature of the field is offset by the weakening of the 
field with distance from the center of curvature. And if the field diminishes 
as 1/r, the upward-downward push on the left and right paddles is exactly 
compensated by the weaker-stronger push on the top and bottom paddles. The 
clockwise and counter-clockwise forces balance, and the paddlewheel does not 
turn - the curl at this location is zero, even though the field lines are curved. 


Figure 2.16 Offsetting components of the curl of A. Panel (a) labels: upward-pointing field, downward-pointing field, stronger field. Panel (b) labels: upward push, downward push, weaker push to the right, stronger push to the right.

For this 1/r field, the curl is zero everywhere except at the center of curvature
(where a singularity exists and must be handled using the delta function).
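
The paddlewheel argument can be mimicked numerically by adding up the push of the field around a small closed loop. The following is a rough sketch, not a procedure from the text: the function names, the loop size, the number of integration steps, and the comparison field $\vec{A} = r\hat{\phi}$ are all assumptions made for illustration. The circulation around a small square, divided by the square's area, approximates the component of the curl perpendicular to the loop.

```python
import math

def circulation_per_area(field, center, half_side=1e-2, steps=400):
    """Line integral of A around a small square in the xy-plane (counter-
    clockwise), divided by the enclosed area. For a small loop this
    approximates the z-component of the curl at `center`."""
    cx, cy = center
    corners = [
        (cx - half_side, cy - half_side),
        (cx + half_side, cy - half_side),
        (cx + half_side, cy + half_side),
        (cx - half_side, cy + half_side),
    ]
    total = 0.0
    for i in range(4):
        (x0, y0), (x1, y1) = corners[i], corners[(i + 1) % 4]
        dx, dy = (x1 - x0) / steps, (y1 - y0) / steps
        for s in range(steps):
            # Midpoint rule on each short segment of this side.
            xm = x0 + (s + 0.5) * dx
            ym = y0 + (s + 0.5) * dy
            ax, ay = field(xm, ym)
            total += ax * dx + ay * dy
    return total / (2.0 * half_side) ** 2

def one_over_r_field(x, y, k=1.0):
    """A = (k/r) phi-hat written in Cartesian components."""
    r_squared = x * x + y * y
    return (-k * y / r_squared, k * x / r_squared)

def rigid_rotation_field(x, y):
    """A = r phi-hat, for comparison; its curl is 2 z-hat everywhere."""
    return (-y, x)

# The loop sits above the center of curvature (the origin), as in Figure 2.16.
print(circulation_per_area(one_over_r_field, (0.0, 2.0)))
print(circulation_per_area(rigid_rotation_field, (0.0, 2.0)))
```

Both fields curve around the origin, but only the second one would turn the paddlewheel: for the 1/r field the contributions from the four sides of the loop cancel almost exactly.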


2.11 Laplacian 


Once you know that the gradient operates on a scalar function and produces a
vector and that the divergence operates on a vector and produces a scalar, it’s
natural to wonder whether these two operations can be combined in a meaningful
way. As it turns out, the divergence of the gradient of a scalar function
$\phi$, written as $\vec{\nabla}\circ(\vec{\nabla}\phi)$, is one of the most useful mathematical operations in
physics and engineering. This operation, usually written as $\nabla^2\phi$ (but sometimes
as $\Delta\phi$), is called the “Laplacian” in honor of Pierre-Simon Laplace, the
great French mathematician and astronomer.

Before trying to understand why the Laplacian operator is so valuable, 
you should begin by recalling the operations of gradient and divergence in 
Cartesian coordinates: 


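Gradient:

$$\vec{\nabla}\phi = \hat{\imath}\frac{\partial \phi}{\partial x} + \hat{\jmath}\frac{\partial \phi}{\partial y} + \hat{k}\frac{\partial \phi}{\partial z}. \tag{2.43}$$

Divergence:

$$\vec{\nabla}\circ\vec{A} = \frac{\partial A_x}{\partial x} + \frac{\partial A_y}{\partial y} + \frac{\partial A_z}{\partial z}. \tag{2.44}$$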


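Since the x-component of the gradient of $\phi$ is $\partial\phi/\partial x$, the y-component of the
gradient of $\phi$ is $\partial\phi/\partial y$, and the z-component of the gradient of $\phi$ is $\partial\phi/\partial z$, the divergence
of the vector produced by the gradient is

$$\vec{\nabla}\circ\vec{\nabla}\phi = \nabla^2\phi = \frac{\partial^2 \phi}{\partial x^2} + \frac{\partial^2 \phi}{\partial y^2} + \frac{\partial^2 \phi}{\partial z^2}. \tag{2.45}$$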


Just as the gradient ($\vec{\nabla}$), divergence ($\vec{\nabla}\circ$), and curl ($\vec{\nabla}\times$) represent
differential operators, so too the Laplacian ($\nabla^2$) is an operator waiting to be
fed a function. As you may recall, the gradient operator tells you the direction
of greatest increase of the function (and how steep the increase is), the
divergence tells you how strongly a vector function “flows” away from a point
(or toward that point if the divergence is negative), and the curl tells you how
strongly a vector function tends to circulate around a point. So what does the
Laplacian, the divergence of the gradient, tell you?

If you write the Laplacian operator as $\nabla^2 = \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2}$, it should
help you see that this operator finds the change in the change of the function
(if you make a graph, the change in the slope) in all directions from the point
of interest. That may not seem very interesting, until you consider that acceleration
is the change in the change of position with time, or that the maxima
and minima of functions (peaks and valleys) are regions in which the slope
changes significantly, or that one way to find blobs and edges in a digital
image is to look for points at which the gradient of the brightness suddenly
changes.

To understand why the Laplacian performs such a diverse set of useful tasks,
it helps to understand that at each point in space, the Laplacian of a function
represents the difference between the value of the function at that point and
the average of the values at surrounding points. How does it do that? Consider
the region around the point labeled (0, 0, 0) in Figure 2.17. The function $\phi$
exists in all three dimensions around this region, and the cube is shown only
to illustrate the location of six points around the central point (0, 0, 0), where
the value of the function $\phi$ is $\phi_0$. Notice that there are points in front of and
behind the central point (along the x-axis), points to the left and right (along
the y-axis), and points above and below (along the z-axis). To see how the
change in the change in $\phi$ is related to $\phi_0$, consider for now the points along the
x-axis, as shown in Figure 2.18. Notice that the value of $\phi$ at the point in back
of the central point is labeled $\phi_{Back}$ and the value of $\phi$ in front of the central
point is labeled $\phi_{Front}$. If each of these points is located a distance of $\Delta x$
from (0, 0, 0), then the partial derivative of $\phi$ at point B can be approximated
by $(\phi_0 - \phi_{Back})/\Delta x$. Likewise, the partial derivative of $\phi$ at point A can be
approximated by $(\phi_{Front} - \phi_0)/\Delta x$.

Figure 2.17 Points surrounding (0, 0, 0) at which $\phi = \phi_0$.

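But the Laplacian involves not just the change in $\phi$, but the change in the
change of $\phi$. For that, you can write

$$\frac{\partial}{\partial x}\left(\frac{\partial \phi}{\partial x}\right) \approx \frac{(\phi_{Front} - \phi_0)/\Delta x - (\phi_0 - \phi_{Back})/\Delta x}{\Delta x}$$

$$\frac{\partial^2 \phi}{\partial x^2} \approx \frac{\phi_{Front} + \phi_{Back} - 2\phi_0}{\Delta x^2}. \tag{2.46}$$

And although this might not look very helpful, good things happen when you
combine this expression with the expression for the two points to the right and
left of (0, 0, 0):

$$\frac{\partial^2 \phi}{\partial y^2} \approx \frac{\phi_{Right} + \phi_{Left} - 2\phi_0}{\Delta y^2}, \tag{2.47}$$

and the equation for the points on top and on the bottom of (0, 0, 0):

$$\frac{\partial^2 \phi}{\partial z^2} \approx \frac{\phi_{Top} + \phi_{Bottom} - 2\phi_0}{\Delta z^2}. \tag{2.48}$$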

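If you pick your locations symmetrically so that $\Delta x = \Delta y = \Delta z$, then these
three equations together give you the following:

$$\frac{\partial^2 \phi}{\partial x^2} + \frac{\partial^2 \phi}{\partial y^2} + \frac{\partial^2 \phi}{\partial z^2} \approx \frac{\phi_{Front} + \phi_{Back} + \phi_{Right} + \phi_{Left} + \phi_{Top} + \phi_{Bottom} - 6\phi_0}{\Delta x^2}. \tag{2.49}$$

Using the del-squared notation for the Laplacian and a little rearranging makes
this

$$\nabla^2\phi \approx -\frac{6}{\Delta x^2}\left[\phi_0 - \frac{1}{6}\left(\phi_{Front} + \phi_{Back} + \phi_{Right} + \phi_{Left} + \phi_{Top} + \phi_{Bottom}\right)\right] = -\frac{6}{\Delta x^2}\left(\phi_0 - \phi_{avg}\right), \tag{2.50}$$

where the average value of the function $\phi$ over the six surrounding points is
$\phi_{avg} = \frac{1}{6}\left(\phi_{Front} + \phi_{Back} + \phi_{Right} + \phi_{Left} + \phi_{Top} + \phi_{Bottom}\right)$.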

Equation 2.50 tells you that the Laplacian of a function $\phi$ at any point is
proportional to the difference between the value of $\phi$ at that point and the average
value of $\phi$ at the surrounding points. The negative sign in this equation
tells you that the Laplacian is negative if the value of the function at the point
of interest is greater than the average of the function’s value at the surrounding
points, and the Laplacian is positive if the value at the point of interest is
smaller than the average of the value at the surrounding points.
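
The six-point averaging idea behind Eq. 2.50 translates almost directly into code. The following is a minimal sketch, assuming a hypothetical helper `laplacian_six_point`, an arbitrary small `spacing`, and two made-up test functions (a bowl and a dome) whose Laplacians are known exactly.

```python
def laplacian_six_point(phi, point, spacing=1e-3):
    """Eq. 2.50-style estimate: -(6 / spacing**2) * (phi_0 - phi_avg)."""
    x, y, z = point
    neighbors = [
        phi(x + spacing, y, z), phi(x - spacing, y, z),  # front, back
        phi(x, y + spacing, z), phi(x, y - spacing, z),  # right, left
        phi(x, y, z + spacing), phi(x, y, z - spacing),  # top, bottom
    ]
    phi_avg = sum(neighbors) / 6.0
    return -6.0 / spacing**2 * (phi(x, y, z) - phi_avg)

def bowl(x, y, z):
    """Valley: the value at the center lies below the surrounding average."""
    return x * x + y * y + z * z

def dome(x, y, z):
    """Peak: the value at the center lies above the surrounding average."""
    return -(x * x + y * y + z * z)

print(laplacian_six_point(bowl, (0.0, 0.0, 0.0)))  # analytic Laplacian is +6
print(laplacian_six_point(dome, (0.0, 0.0, 0.0)))  # analytic Laplacian is -6
```

The signs follow the rule stated above: the estimate is positive where the central value is smaller than the neighboring average and negative where it is larger.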

And how does the difference between a function’s value at a point and the 
average value at neighboring points relate to the divergence of the gradient of 
that function? To understand that, think about a point at which the function’s 
value is greater than the surrounding average - such a point represents a local 
maximum of the function. Likewise, a point at which the function’s value is 
less than the surrounding average represents a local minimum. This is the rea- 
son you may find the Laplacian described as a “concavity detector” or a “peak 
finder” - it finds points at which the value of the function sticks above or falls 
below the values at the surrounding points. 

To better understand how peaks and valleys relate to the divergence of the
gradient of a function, recall that the gradient points in the direction of steepest
incline (or decline if the gradient is negative), and divergence measures the
“flow” of a vector field out of a region (or into the region if the divergence
is negative). Now consider the peak of the function shown in Figure 2.19(a)
and the gradient of the function in the vicinity of that peak, shown in
Figure 2.19(b). Near the peak, the gradient vectors “flow” toward the peak from
all directions. Vector fields that converge upon a point have negative divergence,
so this means that the divergence of the gradient in the vicinity of a
peak will be a large negative number. This is consistent with the conclusion
that the Laplacian is negative near a function’s maximum point.

Figure 2.19 Function $\phi$ (varying as 1/r) and the gradient and contours of $\phi$
near the peak: (a) side view, (b) top view.

Figure 2.20 Function $\phi$ (varying as −1/r) and the gradient and contours of $\phi$
near the bottom of the valley: (a) side view, (b) top view.

The alternative case is shown in Figures 2.20(a) and 2.20(b). Near the bottom
of a valley, the gradient “flows” outward in all directions, so the divergence
of the gradient is a large positive number in this case (again consistent with the
conclusion that the Laplacian of a minimum point is positive). And what is the
value of the Laplacian of a function away from a peak or valley? The answer to
that question depends on the shape of the function in the vicinity of the point in
question. As described in Section 2.9, the value of the divergence depends on
how strongly the function “flows” away from a small volume surrounding the
point of interest. Since the Laplacian involves the divergence of the gradient,
the question is whether the gradient vectors “flow” toward or away from the
point (in other words, whether the gradient vectors tend to concentrate toward
or disperse away from that point). If the inward flow of gradient vectors equals
the outward flow, then the Laplacian of the function is zero at that point. But if
the length and direction of the gradient vectors conspire to make the outward
flow greater than the inward flow at some point, then the Laplacian is positive
at that point.

For example, if you’re climbing out of a circularly symmetric valley with 
constant slope, the gradient vectors are spreading apart without changing in 
length, which means the divergence of the gradient (and hence the Laplacian) 
will have a positive value at that point. But if a different valley has walls for 
which the slope gets less steep (so the gradient vectors get shorter) as you move 
away from the bottom of the valley, it’s possible for the reduced strength of the 
gradient vectors to exactly compensate for the spreading apart of those vectors, 
in which case the Laplacian will be zero. 
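
Two hypothetical valleys, assumed here only for illustration, make this concrete. Take a surface height $\phi$ that depends only on the distance $r$ from the valley's axis, with a positive constant $a$, and use the radial term of the cylindrical Laplacian (Eq. 2.51). The results hold away from the bottom of the valley ($r \neq 0$).

```latex
% Constant slope: \phi = a r, so \partial\phi/\partial r = a
\nabla^2\phi
  = \frac{1}{r}\frac{\partial}{\partial r}\left(r \cdot a\right)
  = \frac{a}{r} > 0

% Slope falling off as 1/r: \phi = a \ln r, so \partial\phi/\partial r = a/r
\nabla^2\phi
  = \frac{1}{r}\frac{\partial}{\partial r}\left(r \cdot \frac{a}{r}\right)
  = \frac{1}{r}\frac{\partial (a)}{\partial r}
  = 0
```

In the first case nothing offsets the spreading of the gradient vectors, so the Laplacian is positive. In the second, the gradient weakens at exactly the rate needed to cancel the spreading.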

To see how this works mathematically, consider a three-dimensional function
$\phi$ whose value decreases in inverse proportion to the distance r from the
origin. This function may be written as $\phi = k/r$, where k is just a constant of
proportionality and r is the distance from the origin. Thus $r = (x^2 + y^2 + z^2)^{1/2}$
and $\phi = k/(x^2 + y^2 + z^2)^{1/2}$. You can find the value of the Laplacian for this
case using Eq. 2.45; the first step is to find the partial derivative of $\phi$ with
respect to x:

$$\frac{\partial \phi}{\partial x} = \frac{-k(2x)}{2(x^2 + y^2 + z^2)^{3/2}} = \frac{-kx}{(x^2 + y^2 + z^2)^{3/2}},$$

after which you take another partial with respect to x:

$$\frac{\partial^2 \phi}{\partial x^2} = \frac{-k}{(x^2 + y^2 + z^2)^{3/2}} + \left(\frac{3}{2}\right)\frac{kx(2x)}{(x^2 + y^2 + z^2)^{5/2}} = \frac{-k}{(x^2 + y^2 + z^2)^{3/2}} + \frac{3kx^2}{(x^2 + y^2 + z^2)^{5/2}}.$$

The same approach for the second-order partials with respect to y and z gives

$$\frac{\partial^2 \phi}{\partial y^2} = \frac{-k}{(x^2 + y^2 + z^2)^{3/2}} + \frac{3ky^2}{(x^2 + y^2 + z^2)^{5/2}},$$

and

$$\frac{\partial^2 \phi}{\partial z^2} = \frac{-k}{(x^2 + y^2 + z^2)^{3/2}} + \frac{3kz^2}{(x^2 + y^2 + z^2)^{5/2}}.$$

Now it’s just a matter of adding all three second-order partials:

$$\frac{\partial^2 \phi}{\partial x^2} + \frac{\partial^2 \phi}{\partial y^2} + \frac{\partial^2 \phi}{\partial z^2} = \frac{-3k}{(x^2 + y^2 + z^2)^{3/2}} + \frac{3k(x^2 + y^2 + z^2)}{(x^2 + y^2 + z^2)^{5/2}} = \frac{-3k}{(x^2 + y^2 + z^2)^{3/2}} + \frac{3k}{(x^2 + y^2 + z^2)^{3/2}} = 0.$$

So for a three-dimensional function with 1/r-dependence, the Laplacian of 
the function is zero everywhere away from the origin. What about at the origin 
itself? That point requires special treatment, since the 1/r-dependence of the 
function becomes problematic at r = 0. That special treatment involves the 
Dirac delta function and integral rather than differential techniques. 

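You may occasionally have need to calculate the Laplacian in non-Cartesian
coordinates. For a function $\psi$, the Laplacian in cylindrical and spherical
coordinates is given by:

Cylindrical

$$\nabla^2\psi = \frac{1}{r}\frac{\partial}{\partial r}\left(r\frac{\partial \psi}{\partial r}\right) + \frac{1}{r^2}\frac{\partial^2 \psi}{\partial \phi^2} + \frac{\partial^2 \psi}{\partial z^2} \tag{2.51}$$

Spherical

$$\nabla^2\psi = \frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial \psi}{\partial r}\right) + \frac{1}{r^2\sin\theta}\frac{\partial}{\partial \theta}\left(\sin\theta\frac{\partial \psi}{\partial \theta}\right) + \frac{1}{r^2\sin^2\theta}\frac{\partial^2 \psi}{\partial \phi^2} \tag{2.52}$$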
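
As a brief check of the spherical form, here is a sketch for the function $\psi = k/r$ discussed earlier in this section, valid for $r \neq 0$. Because $\psi$ depends only on $r$, the two angular terms vanish and only the radial term remains.

```latex
\nabla^2\psi
  = \frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\,\frac{\partial}{\partial r}\frac{k}{r}\right)
  = \frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2 \cdot \frac{-k}{r^2}\right)
  = \frac{1}{r^2}\frac{\partial (-k)}{\partial r}
  = 0
```

This reproduces the Cartesian result in a single line.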


2.12 Chapter 2 problems 

2.1 For vectors $\vec{A} = 3\hat{\imath} + 2\hat{\jmath} - \hat{k}$ and $\vec{B} = \hat{\jmath} + 4\hat{k}$, find the scalar product
$\vec{A}\circ\vec{B}$ and the angle between $\vec{A}$ and $\vec{B}$.

2.2 If vector $\vec{J} = 2\hat{\imath} - \hat{\jmath} + 5\hat{k}$ and $\vec{K} = 3\hat{\imath} + 2\hat{\jmath} + \hat{k}$, find the vector $\vec{L}$ that
equals the cross product $\vec{J}\times\vec{K}$. Also show that $\vec{L}$ is perpendicular to
both $\vec{J}$ and to $\vec{K}$.

2.3 Show that $\vec{A}\circ\vec{B} = A_xB_x + A_yB_y + A_zB_z = |\vec{A}||\vec{B}|\cos(\theta)$ and that
$|\vec{A}\times\vec{B}| = |\vec{A}||\vec{B}|\sin(\theta)$.

2.4 Using the vectors of the previous two problems, find the triple product
$\vec{J}\circ(\vec{A}\times\vec{B})$. Compare your answer to $(\vec{J}\times\vec{A})\circ\vec{B}$.

2.5 Using the vectors of Problems 1 and 2, find the triple vector product
$\vec{J}\times(\vec{A}\times\vec{B})$. Compare your answer to $(\vec{J}\times\vec{A})\times\vec{B}$ and to $\vec{B}\times(\vec{J}\times\vec{A})$.

2.6 For the function $f(x, y) = x^2 + 3y^2 + 2xy + 3x + 5$, find $\partial f/\partial x$ and $\partial f/\partial y$.

2.7 If $\phi = x^2 + y^2$, what is $\vec{\nabla}\phi$ at the position (x, y) = (3 cm, −2 cm)?

2.8 Find the divergence of the vector field given by $\vec{C} = 5xy\,\hat{\imath} - 3x\,\hat{\jmath} + 5z^2\,\hat{k}$.

2.9 What is the curl of the vector field given in the previous problem?

2.10 Find the Laplacian of the function given in Problem 2.6. 

2.11 In mechanics, the work (W) done by a force ($\vec{F}$) acting over a displacement
($d\vec{r}$) is defined as the scalar product between the force and the
displacement, so $W = \vec{F}\circ d\vec{r}$. How much work is done by the vertically
downward force of Earth’s gravity ($|\vec{F}| = mg$, where g is the acceleration
of gravity) on a car with a mass of 1200 kg as the car moves 50
meters down a hill whose surface makes an angle of 20 degrees below
the horizontal?

2.12 Imagine trying to turn the head of a bolt by pushing on the handle of a 
wrench. The vector torque exerted by the force you apply (F) is given by 
the equation $\vec{\tau} = \vec{r}\times\vec{F}$, where $\vec{r}$ is a vector from the point of rotation
to the point of application of the force. If you push on the handle of the 
wrench with a force of 25 N at a distance of 12 cm from the point of 
rotation, in what direction should you push to maximize the torque on 
the bolt head? If you push in that direction, how much torque will you 
exert on the bolt head? 


3 

Vector applications 


The real value of understanding vectors and how to manipulate them becomes 
clear when you realize that your knowledge allows you to solve a variety of 
problems that would be much more difficult without vectors. In this chapter, 
you’ll find detailed explanations of four such problems: a mass sliding down 
an inclined plane, an object moving along a curved path, a charged particle 
in an electric field, and a charged particle in a magnetic field. To solve these 
problems, you’ll need many of the vector concepts and operations described in 
Chapters 1 and 2. 


3.1 Mass on an inclined plane 

Consider the delivery woman pushing a heavy box up the ramp to her delivery 
truck, as illustrated in Figure 3.1. In this situation, there are a number of forces 
acting on the box, so if you want to determine how the box will move, you need 
to know how to work with vectors. Specifically, to solve problems such as this, 
you can use vector addition to find the total force acting on the box, and then 
you can use Newton’s Second Law to relate that total force to the acceleration 
of the box. 

To understand how this works, imagine that the delivery woman slips off the 
side of the ramp, leaving the box free to slide down the ramp under the influ- 
ence of gravity. For starters, pretend that the ramp is so slippery that friction 
between the bottom of the box and the ramp surface is negligible (so the coef- 
ficient of friction is effectively zero). How fast will the box be moving when it 
reaches the bottom of the ramp? Perhaps more importantly, on what does that 
speed depend? 

Whenever you approach a problem like this, it’s a good idea to begin
by drawing a diagram that shows all the forces acting on the box. Such a
“free-body” diagram will help you determine the total force acting on the
object, from which you can easily determine the object’s acceleration using
Newton’s Second Law ($\vec{a} = \Sigma\vec{F}/m$). 1 And once you know the acceleration,
it’s an easy matter to find the velocity. An example of the free-body diagram
for this (frictionless) case is shown in Figure 3.2.

Figure 3.1 The delivery-truck problem.

Figure 3.2 Free-body diagram for mass on frictionless ramp.

With the delivery woman and friction removed from the problem, the only
remaining forces acting on the box are the force of gravity $\vec{F}_g$, which points
vertically downward, 2 and the normal force $\vec{F}_n$, which is perpendicular (or
“normal”) to the surface of the ramp. The origin of these forces is easy to
understand; the gravitational force is produced by the mass of the Earth, and
the normal force is produced by the ramp as a reaction to the force produced by
the box on the ramp (if the ramp weren’t pushing upward on the box, gravity
would cause the box to accelerate straight downward).

1 You may be more accustomed to seeing this as $\vec{F} = m\vec{a}$, but the form shown above is meant to
remind you that it’s the sum of the forces that produces acceleration, and the primary job of all
mass is to resist acceleration (which is why mass lives in the denominator - if the same force is
applied to a large mass and a small mass, the small mass experiences greater acceleration).

2 This ignores local gravitational anomalies, which is a very reasonable thing to do for problems
of this type.

Figure 3.3 Free-body diagram with coordinate axes.


Do these two forces really act only at a single point somewhere inside the 
box, as implied by Figure 3.2? Clearly not, since every particle in the box 
is being pulled downward by the Earth’s gravity, and the force of the ramp 
on the box occurs along the entire underside of the box. But to determine 
the acceleration of the box in this problem, you don’t need to worry about 
the actual point of application of the forces, because you can treat the box 
as a particle that exists at a single location. That’s not always the case; in 
problems involving torque and angular acceleration, for example, the point of 
application of the force may be critically important. But the box in this prob- 
lem is sliding, not rolling, down the ramp, and you’re perfectly justified in 
treating the box as a single particle and drawing the forces as though they 
all act at the same point. Furthermore, you’re less likely to make a mistake 
about the angles of the forces if you draw them as in Figure 3.2. This approach 
can be justified using the concept of center of mass (CM), since for a rigid 
object of mass m you can consider the entire object as a single point and write 
$\vec{a}_{CM} = \vec{F}_{CM}/m$.

Before doing the vector addition of the two forces acting on the box to deter- 
mine the total force, it’s a good idea to draw a set of coordinate axes onto your 
free-body diagram, as in Figure 3.3. Of course, you’re free to draw the axes in 
any direction you choose, but when you’re faced with a problem of a mass on 
an inclined plane, there are certain benefits to drawing the x-axis pointing 
down the ramp (and parallel to the ramp surface) and the y-axis pointing 
upward (and perpendicular to the ramp surface). This approach has the advan- 
tage that the normal force lies entirely along the positive y-axis, and the motion 
of the block sliding down the ramp is entirely in the positive x-direction (as
long as the box stays on the ramp). To pay for that advantage, you’ll have
to use a bit of geometry to find the x- and y-components of the gravitational
force, since the vector $\vec{F}_g$ points straight downward and is therefore aligned
with neither the down-plane (x-) nor the perpendicular-to-plane (y-) axis. 3

Figure 3.4 Geometry to find the angle of $\vec{F}_g$.

The key to finding the x-component ($F_{g,x}$) and the y-component ($F_{g,y}$) of
the gravitational force ($\vec{F}_g$) is to realize that the angle $\theta$ between the ramp
surface and the horizontal is also the angle between $\vec{F}_g$ and the negative y-axis,
as shown in Figure 3.4(a).

If you’re uncertain why the two angles shown as $\theta$ in Figure 3.4(a) must be
the same, take a look at Figure 3.4(b). Completing the two triangles shown in
Figure 3.4(b) should help you see that the angle between $\vec{F}_g$ and the negative
y-axis is indeed $\theta$ (you may also be able to see this by imagining the case in
which $\theta = 0^\circ$ or $\theta = 90^\circ$).

Once you’re convinced that the angle between $\vec{F}_g$ and the negative y-axis
is $\theta$, it’s quite straightforward to determine $\vec{F}_{g,x}$ and $\vec{F}_{g,y}$, the x- and
y-components of the gravitational force vector $\vec{F}_g$. As you can see in
Figure 3.5, the components of $\vec{F}_g$ are given by

$$
\begin{aligned}
\vec{F}_{g,x} &= |\vec{F}_g|\sin\theta\,(\hat{\imath}),\\
\vec{F}_{g,y} &= |\vec{F}_g|\cos\theta\,(-\hat{\jmath}),
\end{aligned}
\tag{3.1}
$$

where the minus sign before the $\hat{\jmath}$ accounts for the fact that this component
points in the negative y-direction.
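As a quick numerical check of Eqs. 3.1, the following Python sketch computes the two components for the tilted axes described above. The function name `gravity_components` and the sample values (a 98 N weight on a 30° ramp) are assumptions chosen only for illustration.

```python
import math

def gravity_components(weight_n, theta_deg):
    """Return (F_gx, F_gy) in newtons for axes tilted with the ramp.

    x points down the ramp; y points away from the ramp surface.
    weight_n is the magnitude |F_g|; theta_deg is the ramp angle.
    """
    theta = math.radians(theta_deg)
    f_gx = weight_n * math.sin(theta)    # along +x (down the ramp)
    f_gy = -weight_n * math.cos(theta)   # along -y (into the ramp)
    return f_gx, f_gy

# Hypothetical example: a 98 N weight on a 30-degree ramp.
f_gx, f_gy = gravity_components(98.0, 30.0)
print(f"F_gx = {f_gx:.1f} N, F_gy = {f_gy:.1f} N")
```

Because the two components are perpendicular, the square root of the sum of their squares recovers the original magnitude, which is a handy check on the geometry.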

3 You may, of course, choose your axes to point exactly horizontally and vertically, in which case
$\vec{F}_g$ would point entirely in the negative y-direction. In that case, the normal force $\vec{F}_n$ would
have both x- and y-components. But since other forces (such as friction and the delivery
woman’s push) generally point along the ramp surface, tilting your coordinate axes may well
save you time later.

Figure 3.5 x- and y-components of $\vec{F}_g$.


A note about notation: as mentioned in Chapter 1, it’s customary to write
Eqs. 3.1 as

$$
\begin{aligned}
F_{g,x} &= |\vec{F}_g|\sin\theta,\\
F_{g,y} &= -|\vec{F}_g|\cos\theta,
\end{aligned}
\tag{3.2}
$$

that is, as scalars rather than vectors. That’s because the direction of vector
components should be clear from the subscript: the x-component is always
in the $\hat{\imath}$ direction (or $-\hat{\imath}$ direction if it’s negative), and the y-component is
always in the $\hat{\jmath}$ direction (or $-\hat{\jmath}$ direction if it’s negative). So you can write
the components of a vector as scalars or vectors, as long as you remember that 
each component points in a specific direction, which means you cannot simply 
add the x- and y-components algebraically, even if they’re written as scalars. 
You must add them as vectors. 

Whether you write the components as vectors or scalars, having the x- and 
y-components of F g in hand and knowing that the normal force of the plane on 
the box is entirely in the positive y-direction, you’re now in a position to use 
vector addition to find the total force acting on the box. Writing the magnitude 
of the sum of the forces in the x-direction, you have

$$
\left|\sum F_x\right| = |\vec{F}_g|\sin\theta, \tag{3.3}
$$

and in the y-direction

$$
\left|\sum F_y\right| = \left(|\vec{F}_n| - |\vec{F}_g|\cos\theta\right). \tag{3.4}
$$

Alternatively, instead of writing separate equations for the x- and
y-components of the total force, you can write a vector equation incorporating
both:

$$
\sum\vec{F} = \left(|\vec{F}_g|\sin\theta\right)\hat{\imath} + \left(|\vec{F}_n| - |\vec{F}_g|\cos\theta\right)\hat{\jmath}, \tag{3.5}
$$

which contains exactly the same information as Eqs. 3.3 and 3.4.

Getting from the total force to the acceleration of the box is a simple step 
thanks to Isaac Newton, whose Second Law tells you that the magnitudes of 
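the x- and y-components of the acceleration are

$$
a_x = \left|\sum F_x\right|/m = \left(|\vec{F}_g|\sin\theta/m\right), \tag{3.6}
$$

and

$$
a_y = \left|\sum F_y\right|/m = \left[\left(|\vec{F}_n| - |\vec{F}_g|\cos\theta\right)/m\right], \tag{3.7}
$$

or, in full vector form,

$$
\vec{a} = \sum\vec{F}/m = \left(|\vec{F}_g|\sin\theta/m\right)\hat{\imath} + \left[\left(|\vec{F}_n| - |\vec{F}_g|\cos\theta\right)/m\right]\hat{\jmath}. \tag{3.8}
$$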

Whether you realize it or not, you almost certainly know two facts that will 
allow you to simplify these equations considerably. The first is that the magnitude
of the force of gravity ($|\vec{F}_g|$) on an object of mass “m” is simply equal to
mg, where “g” is the magnitude of the acceleration of gravity ($9.8\ \mathrm{m/s^2}$ at the
Earth’s surface). 4 So wherever you have the factor $|\vec{F}_g|$, you can substitute the
expression mg.

The second simplification is produced by the realization that as long as the 
box stays on the ramp and doesn’t fly off into the air or break through to the 
ground, the y-component of the acceleration ($a_y$) must remain zero (remember
that the y-axis is perpendicular to the surface of the ramp). Using the fact that
$|\vec{F}_g| = mg$ and that $a_y = 0$ turns Eqs. 3.6 and 3.7 into the following:

$$
a_x = mg\sin\theta/m = g\sin\theta \tag{3.9}
$$

$$
a_y = \left(|\vec{F}_n| - mg\cos\theta\right)/m = 0. \tag{3.10}
$$


When you’re working a physics problem, it’s a good idea to step back from 
your calculations once in a while to look at your intermediate results to see 
if they’re trying to tell you something - and that’s certainly the case at this 
point. Equation 3.9 already has an important result for you: in the absence of 
the upward-pushing delivery woman and with no friction, the box will accelerate
down the ramp (that is, in the +x-direction) with an acceleration that
depends on only two things: which planet the delivery truck is on (that is, the
value of “g”) and the angle that the ramp makes with the horizontal ($\theta$). In this
case, just as for a freely falling object, the mass of the box does not affect its
acceleration. 5

4 Remember that mass is a measure of the amount of material an object contains and weight is
the force of gravity on that mass. So mass is a scalar (magnitude only) and weight is a vector
(magnitude = mg and direction = straight down). Should you travel in space, your weight will
change as you leave the Earth’s gravity behind, but your mass will remain the same.

Since the sine of the ramp angle can never be greater than one, Eq. 3.9 also
tells you that the magnitude of the acceleration of the object ($g\sin\theta$) can never
be greater than g, the acceleration of gravity. It can, of course, be equal to g if
$\sin\theta = 1$. But this would mean that $\theta$ would have to be 90° (since $\sin 90^\circ = 1$),
in which case the ramp would be exactly vertical. In such cases, you no longer
have an object sliding down a ramp; you have an object falling next to a wall.

There’s also good information lurking in Eq. 3.10, but you have to think a bit
to see it. According to this equation, the y-component of the box’s acceleration
is equal to the difference between the magnitude of the normal force ($|\vec{F}_n|$) and
the magnitude of the y-component of the gravitational force ($mg\cos\theta$), divided
by the mass of the box. But since you know that
in this problem the box remains on the ramp and the y-acceleration is therefore
zero, you can use Eq. 3.10 to determine the magnitude of the normal force.
Since

$$
a_y = \left(|\vec{F}_n| - mg\cos\theta\right)/m = 0,
$$

then

$$
|\vec{F}_n| = mg\cos\theta. \tag{3.11}
$$

So the normal force depends on the weight of the object (mg) and the cosine
of the ramp angle ($\theta$). Understanding this will help you avoid a common pitfall
for students who know that the normal force is the reaction force produced
on the object by the ramp, and who then mistakenly conclude that the normal
force must always equal the weight of the object (mg). That line of reasoning
only works for horizontal surfaces, because for any inclined surface, it’s
only the component of the object’s weight that’s perpendicular to the surface
that produces the reaction force we call the normal force. That perpendicular
component of the object’s weight is shown in Figure 3.5 to be $mg\cos\theta$,
which spans the range from mg (when $\theta = 0^\circ$, meaning the ramp is horizontal
and bears the full weight of the object) to zero (when $\theta = 90^\circ$, meaning the
ramp is vertical and bears none of the object’s weight). In all other cases, the
magnitude of the normal force will have a value between 0 and mg.
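To see how Eqs. 3.9 and 3.11 behave between the two limiting cases, here is a minimal Python sketch that tabulates the down-ramp acceleration and the normal force for a few ramp angles. The function name `frictionless_ramp`, the 10 kg mass, and the chosen angles are illustrative assumptions, not values from the problem.

```python
import math

def frictionless_ramp(mass_kg, theta_deg, g=9.8):
    """Return (a_x, normal_force) for a box on a frictionless ramp.

    a_x is the down-ramp acceleration from Eq. 3.9 (m/s^2);
    normal_force is |F_n| from Eq. 3.11 (N).
    """
    theta = math.radians(theta_deg)
    a_x = g * math.sin(theta)                      # independent of mass
    normal_force = mass_kg * g * math.cos(theta)   # between 0 and mg
    return a_x, normal_force

# Hypothetical 10 kg box at several ramp angles.
for theta_deg in (0, 30, 60, 90):
    a_x, f_n = frictionless_ramp(10.0, theta_deg)
    print(f"theta = {theta_deg:2d} deg: a_x = {a_x:.2f} m/s^2, |F_n| = {f_n:.1f} N")
```

At 0° the acceleration vanishes and the normal force equals the full weight; at 90° the acceleration equals g and the normal force drops to zero (up to floating-point rounding).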

If you’re wondering why you should bother finding $\vec{F}_n$ if you’re only interested
in the x-component of the acceleration, the answer is that you may not
care about $\vec{F}_n$ for the frictionless case (unless you’re worried about your ramp
breaking), but you’ll definitely need $\vec{F}_n$ when friction exists between the ramp
surface and the bottom of the box.

5 But doesn’t the Earth pull harder on a more-massive object? Yes it does, but a more-massive
object also resists acceleration more than a less-massive object. Since gravitational mass
(which determines how strongly gravity pulls on an object) has the same value as inertial mass
(which determines how strongly the object resists acceleration), the result is that all objects fall
freely (or slide freely down frictionless ramps) with an acceleration that does not depend on
their mass.

With the magnitude of the down-ramp component of the acceleration ($a_x$)
available from Eq. 3.9, all that remains is for you to find the speed of the 
box at the bottom of the ramp. Finding speed from acceleration turns out to 
be quite straightforward, especially when the acceleration is constant (as it is 
in this case), provided that you’re in possession of either one of two pieces 
of information: the time the box takes to reach the bottom of the ramp, or 
(more likely), the distance from the box’s starting point to the bottom of the 
ramp. You’ll also need the initial speed, which you can generally discern from 
the initial conditions, and which you can take to be zero in this case. As you 
may remember from kinematics, the final speed of an object moving in the
x-direction with initial speed $v_{x,\mathrm{initial}}$ undergoing constant acceleration $a_x$ over
time t is given by

$$
v_{x,\mathrm{final}} = v_{x,\mathrm{initial}} + a_x t, \tag{3.12}
$$

or, if you know d, the distance in the positive x-direction over which the
acceleration occurs,

$$
(v_{x,\mathrm{final}})^2 = (v_{x,\mathrm{initial}})^2 + 2 a_x d. \tag{3.13}
$$

Using the expression for acceleration from Eq. 3.9, this becomes

$$
(v_{x,\mathrm{final}})^2 = (0)^2 + 2(g\sin\theta)d
$$

or

$$
v_{x,\mathrm{final}} = \sqrt{2(g\sin\theta)d}. \tag{3.14}
$$

So, for example, a box sliding down a 2 m ramp with an angle of 30° to the
horizontal on the surface of the Earth will be moving at a speed of

$$
v_{x,\mathrm{final}} = \sqrt{2\left[(9.8\ \mathrm{m/s^2})\sin 30^\circ\right](2\ \mathrm{m})} = 4.4\ \mathrm{m/s} \tag{3.15}
$$

when it reaches the bottom of the ramp. If you’re curious about how long it 
takes the box to travel the 2 m down the ramp under these conditions, you can 
plug this value for the final speed into Eq. 3.12 and solve for t, which turns out 
to be about 0.9 s in this case. 
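The arithmetic in this example can be reproduced with a few lines of Python. This is only a sketch under the stated conditions (constant acceleration, box starting from rest); the helper names `speed_at_bottom` and `time_to_bottom` are invented here for illustration.

```python
import math

def speed_at_bottom(a_x, distance_m, v_initial=0.0):
    """Final speed from Eq. 3.13 for constant acceleration a_x over distance_m."""
    return math.sqrt(v_initial**2 + 2.0 * a_x * distance_m)

def time_to_bottom(a_x, v_final, v_initial=0.0):
    """Travel time from Eq. 3.12 solved for t (requires a_x != 0)."""
    return (v_final - v_initial) / a_x

g = 9.8                                        # m/s^2 at the Earth's surface
a_x = g * math.sin(math.radians(30.0))         # Eq. 3.9 for a 30-degree ramp
v_final = speed_at_bottom(a_x, distance_m=2.0)
t = time_to_bottom(a_x, v_final)
print(f"v_final = {v_final:.1f} m/s, t = {t:.1f} s")
```

Rounded to one decimal place, the results agree with the 4.4 m/s and 0.9 s quoted above.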

Stripping away effects such as friction is often a good way to learn the fun- 
damentals of a problem, but if you’ve ever encountered a ramp outside of 
physics texts, there’s a good chance you had to deal with friction. Happily, 
once you understand how to use vectors, including friction in the “box on a
ramp” problem becomes a simple matter of adding another force into the mix
before solving for the acceleration.

As you may recall, friction operates in two regimes: “static” friction deter- 
mines how hard you have to push on a stationary object to get it moving, but 
once the object is moving, the frictional force that opposes the motion is pro- 
duced by “kinetic” friction. So although both types of friction oppose motion, 
the magnitude of the force produced by static friction depends on the applied 
force (the harder you push, the stronger the opposing force of static friction, 
until the object “breaks free” and begins moving), while the magnitude of the 
kinetic-friction force depends only on the normal force and the coefficient of 
kinetic friction between the object and the surface. 6 To determine the effect
of kinetic friction on the speed of the box at the bottom of the ramp, you can
modify your free-body diagram to include the frictional force ($\vec{F}_f$), as shown
in Figure 3.6.

Notice that the direction of the frictional force is chosen so as to oppose the 
motion, and since the box is moving down the ramp in this case, the force of 
kinetic friction points up the ramp (in the negative x-direction).

To determine the effect of friction on the acceleration of the box sliding
down the ramp, you simply have to include the frictional force ($\vec{F}_f$) in your
equation for the sum of the forces in the x-direction (Eq. 3.3), which becomes

$$
\left|\sum F_x\right| = |\vec{F}_g|\sin\theta - |\vec{F}_f|. \tag{3.16}
$$

This makes the acceleration

$$
a_x = \left|\sum F_x\right|/m = \left(|\vec{F}_g|\sin\theta - |\vec{F}_f|\right)/m. \tag{3.17}
$$

Clearly, to determine the magnitude of the acceleration ($a_x$), you’ll need to
find an expression for $|\vec{F}_f|$, just as you used $mg\sin\theta$ for $|\vec{F}_{g,x}|$ in Eq. 3.9.



Figure 3.6 Free-body diagram for object on ramp with friction. 

6 You can read more about this in introductory physics texts such as Serway & Jewett or 
Halliday, Resnick, & Walker.

Fortunately, that’s easy to do, because the magnitude of the force of kinetic
friction is simply the product of the magnitude of the normal force ($|\vec{F}_n|$) and
the coefficient of kinetic friction ($\mu_k$):

$$
|\vec{F}_f| = \mu_k|\vec{F}_n|. \tag{3.18}
$$

You also know from Eq. 3.11 that $|\vec{F}_n| = mg\cos\theta$, so

$$
\begin{aligned}
a_x &= (mg\sin\theta - \mu_k mg\cos\theta)/m\\
    &= (g\sin\theta - \mu_k g\cos\theta).
\end{aligned}
\tag{3.19}
$$

Comparing this expression for the acceleration of the box to the acceleration
in the frictionless case (Eq. 3.9), you’ll be happy to note that the term due to
gravity ($g\sin\theta$) is exactly the same in both cases, and the term due to friction
($\mu_k g\cos\theta$) is subtracted from the gravity term. This means that the acceleration
of the box will be made smaller by the frictional force. So in the case
considered previously of a box sliding down a 2 m ramp that makes an angle
of 30° with the horizontal, if the coefficient of kinetic friction between the box
and the ramp is 0.4, the speed of the box at the bottom of the ramp will be
reduced to

$$
\begin{aligned}
v_{x,\mathrm{final}} &= \sqrt{2\left[(9.8\ \mathrm{m/s^2})\sin 30^\circ - (0.4)(9.8\ \mathrm{m/s^2})\cos 30^\circ\right](2\ \mathrm{m})}\\
&= 2.5\ \mathrm{m/s}.
\end{aligned}
\tag{3.20}
$$
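The same numbers can be checked in Python. The sketch below evaluates Eq. 3.19 and then reuses Eq. 3.13 with zero initial speed; the function name `kinetic_friction_acceleration` is an assumption made for this illustration.

```python
import math

def kinetic_friction_acceleration(theta_deg, mu_k, g=9.8):
    """Down-ramp acceleration from Eq. 3.19 for a box sliding down the ramp."""
    theta = math.radians(theta_deg)
    return g * math.sin(theta) - mu_k * g * math.cos(theta)

a_x = kinetic_friction_acceleration(theta_deg=30.0, mu_k=0.4)
v_final = math.sqrt(2.0 * a_x * 2.0)   # Eq. 3.13 with zero initial speed, d = 2 m
print(f"a_x = {a_x:.2f} m/s^2, v_final = {v_final:.1f} m/s")
```

Friction cuts the acceleration from 4.9 m/s² to about 1.5 m/s², which is why the final speed falls from 4.4 m/s to about 2.5 m/s.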

There is one aspect of Eq. 3.19 that may worry you: what if the second term
is larger than the first? For any angle between 0° and 45°, the cosine is bigger
than the sine, so if the coefficient of kinetic friction ($\mu_k$) is sufficiently large,
this equation predicts that the acceleration will be in the negative x-direction,
meaning the box will accelerate up the ramp even if no one is pushing on it.
As physicists like to say, “That’s not physical,” meaning that this result contradicts
other well-established laws of physics (conservation of energy comes to
mind in this case). So where have we gone wrong in our analysis? We haven’t,
really; you just need to think carefully about the initial assumptions. One of
those assumptions was that the box is travelling down the ramp, which is why
we drew the frictional force pointing up the ramp in our free-body diagram
(Figure 3.6). But if the ramp isn’t very steep and the coefficient of friction
between the box and the ramp is sufficiently large, the down-ramp component
of the force of gravity will not be strong enough to overcome the frictional
force, and the box will not slide down the ramp. 7 So there’s nothing wrong
with Eq. 3.19, it’s just that it only applies to the situation in which the box is
moving down the ramp under the influence of gravity, in which case the force
of kinetic friction points up the ramp.

7 You can determine whether the box will move by comparing the maximum static frictional
force (which is just the product of the coefficient of static friction and the normal force) to the
sum of the x-components of all the other forces.
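The comparison between gravity’s down-ramp component and the maximum static frictional force can be sketched in Python as follows. The function names and the coefficient values are hypothetical, and the check assumes gravity is the only force with a component along the ramp.

```python
import math

def slides_from_rest(theta_deg, mu_s):
    """True if gravity's down-ramp component exceeds maximum static friction.

    Compares m*g*sin(theta) with mu_s*m*g*cos(theta); the factor m*g cancels.
    """
    theta = math.radians(theta_deg)
    return math.sin(theta) > mu_s * math.cos(theta)

def eq_3_19(theta_deg, mu_k, g=9.8):
    """Eq. 3.19, valid only while the box is sliding down the ramp."""
    theta = math.radians(theta_deg)
    return g * (math.sin(theta) - mu_k * math.cos(theta))

print(slides_from_rest(30.0, mu_s=0.5))   # steeper ramp: the box breaks free
print(slides_from_rest(20.0, mu_s=0.5))   # shallower ramp: the box stays put
# For a box released from rest, a negative value here signals that the
# "sliding down the ramp" assumption behind Eq. 3.19 has failed.
print(eq_3_19(20.0, mu_k=0.4))
```

Checking the sliding condition first is one way to avoid applying Eq. 3.19 outside the situation for which it was derived.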

So there you have it. You’ve used vectors to represent the forces of gravity 
and friction, and knowing how to find vector components and how to perform 
vector addition has allowed you to find the acceleration and speed of the box 
under various conditions. And if a box sliding straight down a ramp is a bit too 
mundane for your taste, you may want to take a look at the next three appli- 
cation examples. In them, you’ll see how vectors can be helpful in analyzing 
motion on a curved path and how vector operations can be used to understand 
the behavior of electric and magnetic fields. 


3.2 Curvilinear motion 

In everyday language, the word “acceleration” is used as a synonym for 
“increasing speed.” Hence the “accelerator” in an automobile usually refers 
to the gas pedal. But in physics and engineering, acceleration is defined as any 
change in velocity, and velocity is a vector quantity with both magnitude and 
direction. So changing the direction of the velocity is also a form of accelera- 
tion, meaning that most cars have three accelerators: the gas pedal, the brake, 
and the steering wheel. “Stepping on the gas” produces an acceleration in the 
same direction as the velocity vector (causing the speed to increase), press- 
ing on the brake produces an acceleration directly opposite to the direction of 
the velocity vector (causing the speed to decrease), and turning the steering 
wheel produces an acceleration perpendicular to the velocity vector (causing 
the car’s direction to change but not affecting the speed). 8 Acceleration in the 
direction parallel (or antiparallel) to the velocity vector is called “tangential” 
and acceleration perpendicular to the velocity is called “radial.” Any time an 
object experiences radial acceleration, it does not move in a straight line, and 
its motion is called “curvilinear.” An example of curvilinear motion is shown 
in Figure 3.7, in which a car is going around a curve. 

Note that at any instant, the velocity vector points directly along the path 
the car is following. For a curving path, that means the instantaneous veloc- 
ity vector is tangent to the path, as you can see when the car is at position B 
in Figure 3.7. If you wish to determine the acceleration at points such as A, 
B, and C along the car’s path, it’s not enough to know the velocity at those 
points; you have to know how the velocity is changing with time at those 
locations. 

8 In reality, turning the steering wheel produces frictional forces that also slow the car down, but 
it’s the perpendicular component of the acceleration that causes the car to turn.

Figure 3.7 Velocity vectors for a car following a curved path.


A good way to visualize the acceleration vector is to graphically represent 
the velocity vector at the instants of time just before and just after the car is at 
positions A, B, and C. This is illustrated in Figure 3.8 for the following case: 
the car is slowing down at Position A as it approaches the turn, maintaining 
constant speed while turning at Position B, and then speeding up as it exits the 
turn at Position C. 

You can get a sense of the acceleration just by examining the change in the 
velocity vectors at each position. Comparing the velocity vectors just before 
and just after Position A, you can see that the magnitude (length) of the vector 
is getting smaller but the direction remains the same. This means that the speed 
of the car is decreasing but the car is not yet turning. Now look at the velocity 
vectors just before and after Position B: the direction of the vector is changing 
but its length is not, so the car is turning while maintaining constant speed. 
Finally, by examining the velocity vectors before and after Position C, you can 
see that the length is increasing, meaning the car is speeding up after leaving 
the turn. 

The direction of the acceleration is easily found by remembering that the
average acceleration is given by the equation $\vec{a} = \Delta\vec{v}/\Delta t$, where $\Delta\vec{v}$ is the
change in velocity over time $\Delta t$. That change in velocity is just $\vec{v}_{\mathrm{final}} - \vec{v}_{\mathrm{initial}}$,
which you can determine by subtracting the earlier velocity vector from the
later one at each position in Figure 3.8. To make that easier, the vectors are
reproduced in Figure 3.9.

Figure 3.8 Change in car’s velocity vectors at Positions A, B, and C.



Figure 3.9 Velocity vectors before and after Positions A, B, and C. 


Note that the vectors shown in Figure 3.9 include not only $\vec{v}_{\mathrm{final}}$ and $\vec{v}_{\mathrm{initial}}$,
but also the negative of $\vec{v}_{\mathrm{initial}}$. That’s because you’ll need to know $-\vec{v}_{\mathrm{initial}}$
to compute the change in velocity, since $\Delta\vec{v} = \vec{v}_{\mathrm{final}} - \vec{v}_{\mathrm{initial}}$, which is the
same as $\vec{v}_{\mathrm{final}} + (-\vec{v}_{\mathrm{initial}})$. Remember that to add two vectors graphically
you simply move the tail of one to the head of the other and then draw the
resultant from the start of the first to the end of the second vector. The results
of adding vectors $\vec{v}_{\mathrm{final}}$ and $-\vec{v}_{\mathrm{initial}}$ are shown in Figure 3.10.

Figure 3.10 Change in velocity vectors at Positions A, B, and C.

In Figure 3.10, the velocity vectors $-\vec{v}_{\mathrm{initial}}$ and $\vec{v}_{\mathrm{final}}$ for Positions A and
C are shown slightly offset since they would overlay one another if they were
drawn truly head-to-tail. If you look at the direction of the vector representing
the change in velocity ($\Delta\vec{v}$) at each position, you’ll see that while the car is
slowing down at Position A, the change in velocity is in the opposite direction
from the velocity at this point. Since the acceleration ($\vec{a}$) is defined as the vector
change in velocity ($\Delta\vec{v}$) divided by the scalar time period ($\Delta t$) over which that
change occurs, the direction of $\vec{a}$ must be the same as the direction of $\Delta\vec{v}$.
Hence the acceleration direction at Position A is opposite to the direction of
the velocity vector, as you’d expect when the car is slowing down. This is an
example of negative tangential acceleration.

Now consider the direction of the vector change in velocity $\Delta\vec{v}$ at Position B,
where the car is going around the turn at constant speed. In this case, subtracting
$\vec{v}_{\mathrm{initial}}$ from $\vec{v}_{\mathrm{final}}$ gives a vector $\Delta\vec{v}$ that is perpendicular to the velocity
vector. This shows that the acceleration vector for an object moving along a
curve at constant speed points toward the center of curvature (to help you visualize
this direction, the $\Delta\vec{v}$ vectors are shown on the car’s path in Figure 3.11).
At Position B, this is an example of radial acceleration. 9

Finally, as the car speeds up at Position C, you can see that the direction
of the vector change in velocity $\Delta\vec{v}$ is the same as the direction of the velocity
vector, meaning that the acceleration in this case is parallel to the velocity.
Hence this is an example of positive tangential acceleration.
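The graphical subtraction described above can also be done with components. The Python sketch below computes the change in velocity for three made-up cases resembling Positions A, B, and C and splits it into parts parallel and perpendicular to the direction of travel. All velocity values and the function name `split_delta_v` are assumptions for illustration; they are not taken from the figures.

```python
import math

def split_delta_v(v_initial, v_final):
    """Split delta-v = v_final - v_initial into parts parallel and
    perpendicular to the mid-interval direction of travel.

    Vectors are (x, y) tuples in m/s. Assumes v_initial + v_final is nonzero.
    """
    dvx = v_final[0] - v_initial[0]
    dvy = v_final[1] - v_initial[1]
    # Unit vector along the average of the two velocities.
    mx = 0.5 * (v_initial[0] + v_final[0])
    my = 0.5 * (v_initial[1] + v_final[1])
    norm = math.hypot(mx, my)
    ux, uy = mx / norm, my / norm
    parallel = dvx * ux + dvy * uy           # > 0 speeding up, < 0 slowing down
    perpendicular = -dvx * uy + dvy * ux     # > 0 turning left, < 0 turning right
    return parallel, perpendicular

turn = math.radians(-10.0)  # hypothetical 10-degree turn to the right
cases = {
    "A (slowing)":  ((20.0, 0.0), (15.0, 0.0)),
    "B (turning)":  ((15.0, 0.0), (15.0 * math.cos(turn), 15.0 * math.sin(turn))),
    "C (speeding)": ((0.0, -15.0), (0.0, -20.0)),
}
results = {name: split_delta_v(vi, vf) for name, (vi, vf) in cases.items()}
for name, (par, perp) in results.items():
    print(f"{name}: parallel = {par:+.2f} m/s, perpendicular = {perp:+.2f} m/s")
```

In this sketch the parallel part is negative at A, essentially zero at B, and positive at C, while only B has a perpendicular part, matching the tangential and radial cases described in the text.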

For Position B, a careful analysis of the length of the vector change in veloc- 
ity reveals that the magnitude of the radial acceleration depends on the square 
of the speed and on the radius of curvature of the path. Before getting into 
that, it’s worth a few minutes of your time to make sure you understand the 
terminology commonly used to describe acceleration and force in curvilinear 
motion. Acceleration toward the center of curvature (such as the acceleration at 
Position B in Figure 3.11) is called “centripetal” (for “center-seeking”) acceleration,
and the force producing that acceleration is often called centripetal
force. It’s important for you to understand that a centripetal force is not a new
kind of force that is somehow different from mechanical, electrical, magnetic,
or other kinds of force. The word “centripetal” simply describes the direction
of the force, but the force itself is provided by the same old kinds of forces to
which you’re accustomed. So for a car going around a curve, the centripetal
force is simply the frictional force exerted by the road on the tires. If you tie a rock
to a rope and twirl the rope in a circle, the centripetal force on the rock is
produced by the tension of the rope. And if you fill a bucket with water and
swing it over your head, the centripetal force on the bucket (and via the bucket
on the water) comes from the muscles in your arm. So the centripetal force is
whatever force is producing the centripetal acceleration that causes the object
to follow a curved path.

9 As described later in this section, most texts define the positive direction for radial acceleration
to be outward from the center of curvature, in which case the acceleration at Point B would be
considered negative radial acceleration.

Figure 3.11 Acceleration vectors at Positions A, B, and C.

As footnoted earlier, it’s conventional to consider radial acceleration ($a_r$)
as positive outward (away from the center of curvature), and since centripetal
acceleration ($a_c$) is defined as positive toward the center of curvature, you may
run across an equation such as $a_r = -a_c$. This is simply a statement that the
radial acceleration and centripetal acceleration are commonly defined to have
the same magnitude but opposite directions.

You should note that in the case of the car on the curving road, the rock being 
twirled in a circle on a rope, and the bucket of water being swung over your 
head, the centripetal acceleration (and hence the centripetal force) is toward the 
center of curvature, and there is no acceleration (and no force) pointing radially 
outward. But what about the “centrifugal” force that the occupants of the car 
feel toward the outside of the curve (that is, toward the left door if the car is 
turning to the right)? What they’re feeling is the force of the left door on their
bodies as they attempt to obey Newton’s First Law and continue moving in a
straight line while the car is accelerating to the right. So centrifugal force is the
apparent force experienced by observers in the reference frame that is rotating
with the object (physicists refer to accelerating reference frames such as this
as “non-inertial”). Hence if you’re riding in a right-turning car, as you slide
across the seat and up against the left door, in your (rotating) reference frame 
you’re accelerating to your left, which causes you to conclude that there’s a 
force in that direction (outward from the center of curvature). But for those 
of us not riding in the car, we don’t see any such force; we simply observe 
the centripetal acceleration of the car as the friction of the tires on the road 
provides a centripetal (rightward) force. 

The concepts of centripetal and centrifugal force can be understood by considering
an Olympic hammer-thrower as she spins a heavy mass on the end
of a cable, as illustrated from above in Figure 3.12. For the thrower, it feels
like the object is pulling directly outward (away from her). Once again, in 
the non-rotating reference frame of the stadium, that’s just because the object 
is attempting to obey Newton’s First Law and continue moving in a straight 
line. So from our vantage point in the viewing stand, we see that the hammer- 
thrower is having to produce a centripetal (radially inward) force to make the 
object follow a curved path. 

So is the hammer-thrower wrong in her assessment? Absolutely not. In her 
reference frame, which is rotating along with the mass, her conclusion that 
a radially outward (centrifugal) force exists is perfectly valid. After all, she 
knows that she has to exert a very strong inward force on the cable to keep the 
mass at the same distance from her (because in her reference frame the mass
has zero acceleration until she releases it). Hence she is correct in concluding
that in her reference frame there must be a force in the radially outward direction
to balance her inward pull. So if you hear someone say that the centrifugal
force is “fictitious,” they generally mean that centrifugal force is an apparent
force to an observer in a rotating (non-inertial) reference frame.

Figure 3.12 Top view of hammer-thrower.

Once you understand the concepts of centripetal acceleration and force, it’s 
reasonable to ask how strong the centripetal force must be to cause an object to 
follow a given path. It’s simple to determine the centripetal force using New- 
ton’s Second Law (F = ma) if you know the object’s mass and have some 
way of finding the centripetal acceleration. Happily, the centripetal accelera- 
tion turns out to depend only on the object’s speed and the radius of curvature 
of the path, as you can see by considering Figures 3.13 and 3.14. 

In Figure 3.13 you can see the velocity vectors at two locations for an object
in uniform circular motion (meaning that the object’s speed and the radius of
curvature are both constant over the time period under consideration). Note
that the two positions are separated by angle $\Delta\theta$ at the center of curvature,
which makes the arc length between the initial and final positions equal to
$r\,\Delta\theta$, where r is the radius of curvature and $\Delta\theta$ is in radians. Since the speed
of the object is constant over this distance, you know that $|\vec{v}_{\mathrm{initial}}|$ must equal
$|\vec{v}_{\mathrm{final}}|$ (in other words, the direction but not the length of the velocity vector
has changed). You can therefore set $|\vec{v}_{\mathrm{initial}}| = |\vec{v}_{\mathrm{final}}| = |\vec{v}|$, where $|\vec{v}|$ is the
speed of the object at both positions. Since the average speed of the object is
defined as the distance covered divided by the time taken to cover that distance,
you can write

$$
|\vec{v}| = \frac{r\,\Delta\theta}{\Delta t}, \tag{3.21}
$$

which means that

$$
\Delta\theta = \frac{|\vec{v}|\,\Delta t}{r}. \tag{3.22}
$$

Figure 3.13 Geometry of changing direction of velocity.

Figure 3.14 Geometry for determining length of $\Delta\vec{v}$.


The reason that an expression such as Eq. 3.22 for $\Delta\theta$ is valuable is that this
angle change is directly related to the magnitude of the vector change in velocity,
which you need to know if you want to find the centripetal acceleration.
To see that, consider what happens if you form the vector $\Delta\vec{v}$ by adding $\vec{v}_{\mathrm{final}}$
to $-\vec{v}_{\mathrm{initial}}$, as in Figure 3.14. The first thing you should note is that the angle
between the vectors $\vec{v}_{\mathrm{final}}$ and $-\vec{v}_{\mathrm{initial}}$ is equal to $\Delta\theta$ (if you don’t see why
that’s true, go back to Figure 3.13 and imagine extending both vectors $\vec{v}_{\mathrm{final}}$
and $-\vec{v}_{\mathrm{initial}}$ until they cross). Also note that the vector $\Delta\vec{v}$ is drawn at the
location mid-way between the original location of $\vec{v}_{\mathrm{initial}}$ and the original
location of $\vec{v}_{\mathrm{final}}$, since that’s the location at which you’re finding the centripetal
acceleration. The final thing to note in this figure is that both $\vec{v}_{\mathrm{final}}$
and $-\vec{v}_{\mathrm{initial}}$ have length equal to $|\vec{v}|$, which makes the arc length shown in the
figure equal to $|\vec{v}|\,\Delta\theta$.

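Now imagine what will happen if you allow the angle $\Delta\theta$ to shrink toward
zero. As the angle decreases, the arc length $|\vec{v}|\,\Delta\theta$ will get closer and closer to
the length of $\Delta\vec{v}$. Plugging in the value for $\Delta\theta$ from Eq. 3.22, you have in the
small-angle limit

$$
|\Delta\vec{v}| \approx |\vec{v}|\,\Delta\theta = |\vec{v}|\,\frac{|\vec{v}|\,\Delta t}{r} = \frac{|\vec{v}|^2\,\Delta t}{r}, \tag{3.23}
$$

which means that the magnitude of the instantaneous centripetal acceleration is

$$
|\vec{a}_c| = \frac{|\Delta\vec{v}|}{\Delta t} = \frac{|\vec{v}|^2\,\Delta t}{r\,\Delta t} = \frac{|\vec{v}|^2}{r}. \tag{3.24}
$$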


So there you have it: the centripetal acceleration at any given point is simply 
the square of the speed divided by the radius of curvature of the path at that 
point. Hence doubling your speed means that your centripetal acceleration is
four times larger, which means that the centripetal force must be four times
stronger.

If you’re concerned that Eq. 3.24 may apply only in the case of uniform 
circular motion, remember that by allowing $\Delta\theta$ to become arbitrarily small
you’ve ensured that neither the speed nor the radius of curvature has changed 
during the time period under consideration. 

What does Eq. 3.24 tell you about the amount of force needed to cause 
an object to follow a specified curving path? Consider the hammer-thrower 
discussed above and shown in Figure 3.12, and assume that she intends to 
launch a 4 kg mass at the end of a 1.2 m cable with a speed of 20 m/s. Assuming 
she achieves her maximum speed just before letting go of the cable, at that 
point the centripetal acceleration will be

$$|\vec{a}_c| = \frac{|\vec{v}|^2}{r} = \frac{(20\ \mathrm{m/s})^2}{1.2\ \mathrm{m}} = 333.3\ \mathrm{m/s^2},$$

which means that the thrower must provide a centripetal force of 

$$|\vec{F}_c| = m\,|\vec{a}_c| = 4\ \mathrm{kg}\,(333.3\ \mathrm{m/s^2}) = 1333.3\ \mathrm{N},$$

which is almost 300 pounds of force (and this doesn’t include the mass of the 
cable). 
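
If you’d like to check the hammer-thrower numbers yourself, a few lines of Python will do it. This is only a minimal sketch of Eq. 3.24: the function names and the pound-force conversion factor are assumptions added for illustration and do not come from the text.

```python
def centripetal_acceleration(speed, radius):
    """Return |a_c| = |v|^2 / r (Eq. 3.24), in m/s^2."""
    return speed ** 2 / radius


def centripetal_force(mass, speed, radius):
    """Return |F_c| = m |a_c|, in newtons."""
    return mass * centripetal_acceleration(speed, radius)


# Assumed conversion factor (not from the text): newtons in one pound of force.
NEWTONS_PER_POUND_FORCE = 4.448222

# Values from the hammer-thrower example: 4 kg mass, 1.2 m cable, 20 m/s.
a_c = centripetal_acceleration(speed=20.0, radius=1.2)
f_c = centripetal_force(mass=4.0, speed=20.0, radius=1.2)
pounds = f_c / NEWTONS_PER_POUND_FORCE

print(f"a_c = {a_c:.1f} m/s^2")
print(f"F_c = {f_c:.1f} N, or about {pounds:.0f} pounds of force")

# Doubling the speed at the same radius quadruples the required acceleration.
ratio = centripetal_acceleration(40.0, 1.2) / a_c
print(f"acceleration ratio after doubling the speed: {ratio:.1f}")
```

Changing the radius instead of the speed shows the other half of the relationship: the required force grows as the cable gets shorter.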

With Eq. 3.24 to help you find the magnitude of the centripetal acceleration,
and knowing that the tangential acceleration is just the change in speed
over time (a_tang = Δ|v|/Δt), the total acceleration can be found through
vector addition, as shown in Figure 3.15. Thus the magnitude of the total
acceleration is



$$|\vec{a}_{total}| = \sqrt{\left(\frac{|\vec{v}|^2}{r}\right)^2 + \left(\frac{\Delta|\vec{v}|}{\Delta t}\right)^2}. \qquad (3.25)$$

Figure 3.15 Total acceleration as the vector sum of centripetal and tangential
acceleration.


You’ll find an example of combined tangential and centripetal acceleration 
in the problems at the end of this chapter. 
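
Before you try that problem, it may help to see the combination worked through with made-up numbers. The sketch below assumes a hypothetical object moving at 30 m/s on a path with a 50 m radius of curvature while losing 4 m/s of speed over 2 s; none of these values come from the text, and the function name is an illustrative choice.

```python
import math


def total_acceleration(speed, radius, speed_change, time_interval):
    """Combine centripetal and tangential acceleration (Eq. 3.25).

    Returns (magnitude, centripetal part, tangential part), all in m/s^2.
    """
    a_centripetal = speed ** 2 / radius           # Eq. 3.24
    a_tangential = speed_change / time_interval   # change in speed over time
    # The two parts are perpendicular, so they add like the legs of a
    # right triangle.
    magnitude = math.hypot(a_centripetal, a_tangential)
    return magnitude, a_centripetal, a_tangential


# Hypothetical values chosen only for illustration.
a_total, a_c, a_t = total_acceleration(
    speed=30.0, radius=50.0, speed_change=-4.0, time_interval=2.0
)

print(f"centripetal: {a_c:.2f} m/s^2")
print(f"tangential:  {a_t:.2f} m/s^2 (negative because the object is slowing)")
print(f"total:       {a_total:.2f} m/s^2")
```

Notice that the sign of the tangential part affects the direction of the total acceleration but not its magnitude.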


3.3 The electric field 


If the previous two sections convinced you that vectors are very helpful in 
solving mechanics problems, the next two sections should help you under- 
stand why vectors are absolutely essential in problems involving electric and 
magnetic fields and their effect on charged particles. You’ll also see how the 
vector operations of divergence, curl, gradient, and Laplacian are used in elec- 
trostatics. Even if you’ve never taken an E&M course (and never hope to), the 
examples in these sections should be sufficiently self-contained to allow you 
to understand how vectors and vector operations can be used in E&M. 

The natural way to begin a discussion of electric and magnetic fields is to 
provide a clear, concise definition that states exactly what an electric or mag- 
netic field is. Such a definition would appear right here if I had one. But almost 
two centuries after Michael Faraday first used the words “field of force” to 
describe the region around electric charges, we still don’t have a standard way 
of saying what such a field is. The Oxford English Dictionary provides def- 
initions for “field” that include an “area or space” under the influence of an 
agent, a “state or situation” in which force is exerted, and the “action” of a 
force. According to James Clerk Maxwell, “The electric field is the portion of 
space in the neighbourhood of electrified bodies.” In Halliday, Resnick, and 
Walker you can learn to define the electric field by placing a small positive test 
charge q_0 at some point and measuring the electrostatic force F_e on that test
charge;¹⁰ the electric field E is then defined as E = F_e/q_0. In Griffiths’ Introduction
to Electrodynamics, he states that “... physically, E(P) is the force
per unit charge that would be exerted on a test charge placed at P.” The words
“would be” in that definition are important, because it is not necessary for the 
test charge to be present in order for the field to exist. 

10 Why do physics and engineering texts always refer to a small test charge? For two reasons: 
firstly, the amount of charge on the test charge must be small so that the electric field produced 
by the test charge is negligible when compared to the electric field that you’re trying to 
determine using the test charge. Secondly, the test charge must be physically small because 
you’re using it to determine the field at a specific position, so you don't want your test charge 
to extend over a large region of space.

The common thread running through all these definitions is this: fields and
forces are closely related. So we’ll take the following as our definition of the
electric field E:

$$\vec{E} \equiv \frac{\vec{F}_e}{q_0}, \qquad (3.26)$$

where E is the vector electric field, q_0 is a small test charge, and F_e is the
electric force produced on the test charge by the electric field. Defining the
electric field through this equation should help you remember that E is a
vector quantity with magnitude directly proportional to force and with
direction given by the direction of the force on a positive test charge (because
if q_0 is negative, there would be a minus sign on one side of the equation,
which would mean that vector E would be in the opposite direction from
vector F_e).

This definition should also help you see that E has dimensions of force 
divided by charge, for which the standard (SI) units are newtons per coulomb 
(N/C). These units are equivalent to volts per meter (V/m), since volts have 
dimensions of force times distance divided by charge (units of newtons times 
meters/coulombs). So you’ll find the units of electric field given as N/C in 
some texts and V/m in others, and you can rest assured that these mean exactly 
the same thing. 

There is, however, something important to be noticed in the units of the elec- 
tric field vector: the dimension of length (units of meters in this case) appears 
in the denominator of the dimensions of the electric field. And that means that 
the vector that represents an electric field has a fundamental difference from 
the vectors that represent quantities such as position (which has dimension of 
length), velocity (dimension of length over time), or acceleration (dimension 
of length over time squared). As you can read in Chapter 4, that’s because vec- 
tors whose dimensions contain length in the numerator transform oppositely to 
vectors whose dimensions have length in the denominator when you perform 
certain coordinate-system changes. If this seems unclear and you don’t plan to 
venture into the tensor portion of this book, do not panic; none of this will pre- 
vent you from using the concepts and operations described in Chapters 1 and 2 
to solve problems involving vectors of this kind, exactly as you’re about to do 
in the remainder of this section. But if you’ve run across objects called “one- 
forms” or “covectors” (of which the electric field is an example) and you’re 
wondering how those objects are different from the things you’ve been call- 
ing vectors, the appearance of length in the denominator of the dimension is 
the beginning of the answer (you’ll find the rest of the answer in Chapter 4 if 
you’re interested). 




You should also make sure you understand that if you know the electric field
E at a given location, placing any amount of charge q at that location will result
in an electric force F_e given by

$$\vec{F}_e = q\vec{E}. \qquad (3.27)$$


So while Eq. 3.26 uses the electric force on a positive test charge to define the 
electric field, Eq. 3.27 is a generally useful expression for finding the electric 
force on any amount of charge at the location for which the electric field is 
known. 
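
To see Eq. 3.27 in action, you can multiply each component of a known field by the charge. In this minimal sketch the uniform 100 N/C field is a hypothetical value, and the function name is an assumption made for illustration.

```python
ELECTRON_CHARGE = -1.6e-19  # coulombs (the magnitude used in this chapter)


def electric_force(charge, field):
    """Return F_e = q E (Eq. 3.27), applied component by component."""
    return tuple(charge * component for component in field)


# Hypothetical uniform field pointing in the +x direction, in N/C.
field = (100.0, 0.0, 0.0)
force = electric_force(ELECTRON_CHARGE, field)

print("field (N/C):", field)
print("force on an electron (N):", force)
```

Because the electron’s charge is negative, the x-component of the force comes out negative: the force points opposite to the field.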

Defining an electric field is useful, but exactly how would you go about pro- 
ducing an electric field? One way is to gather up some electric charge, because 
every bit of charge produces an electric field, just as every bit of mass produces 
a gravitational field. Electric fields can also be produced by changing magnetic 
fields, but it is the “electrostatic” field produced by stationary electric charge 
that will be used to demonstrate the application of vectors in this section. 

It’s often helpful to be able to visualize the electric field in the vicinity of 
a charged object. The most common approaches to constructing a visual rep- 
resentation of an electric field are to use either arrows or “field lines” which 
point in the direction of the field at each point in space. In the arrow approach, 
the strength of the field is indicated by the length of the arrow, while in the 
field-line approach, it’s the density of the lines that tells you the field strength, 
with closer lines signifying a stronger field. When you look at a drawing of 
electric field lines or arrows, be sure to remember that the field exists between 
the lines as well. 

The electric fields produced by positive and negative point charges are
shown using the arrow approach in Figure 3.16 and using the field-line
approach in Figure 3.17. When you look at electric field lines such as these,
don’t forget that the field arrows and lines always point in the direction of
the electric force on a positive test charge, and that electrostatic field lines
always begin on positive charge and end on negative charge. And since the
field lines show the direction of the electric field at any given point, it’s
impossible for two field lines to cross, since that would indicate that the electric
field is pointing in more than one direction at the point of intersection (if two
electric fields are superimposed at a given point, they simply add as vectors to
give the total electric field at that point, and that total field can only point in a
single direction).

Figure 3.16 The electric field of positive and negative point charges drawn
using arrows.

Figure 3.17 The electric field of positive and negative point charges drawn
using field lines.

At this point, you should make sure that you understand that electric fields
can both be produced by electric charge and produce a force on another
electric charge. So you’re likely to face problems in which you first have to
determine the total electric field produced by charge at a certain location and 
then figure out the effect of that field on a completely different charge (not one 
of the charges producing the field). But doesn’t the charge that’s being affected 
(let’s call that one the “subject charge”) also produce its own electric field? Yes 
it does, but as long as the electric field produced by the subject charge isn’t 
strong enough to cause the other charges to move around, you can approach 
problems like this by finding the total electric field produced by all the other 
charges and then using that field to determine the force on the subject charge. 
This approach is very much like finding the Earth’s gravitational field at some 
point in space and then using that field to figure out the gravitational force on 
an object of known mass at that location, without considering what effect the 
mass of the object might have on the Earth. 

Problems like this are especially straightforward if the electric field is being 
produced by one or more discrete point charges. That’s because the electric 
field E of a point charge q is simply

$$\vec{E} = k_e \frac{q}{r^2}\,\hat{r}, \qquad (3.28)$$

where k_e is the Coulomb constant (8.99 × 10^9 N m²/C²), r is the distance in
meters from the point charge to the location at which the electric field is being
determined, and r̂ is a unit vector pointing radially outward from the point
charge.

Thus a single proton (electric charge of 1.6 × 10^-19 C) at a distance of one
meter produces an electric field given by

$$\vec{E} = k_e \frac{q}{r^2}\,\hat{r} = \left(8.99 \times 10^{9}\ \mathrm{N\,m^2/C^2}\right)\frac{1.6 \times 10^{-19}\ \mathrm{C}}{(1\ \mathrm{m})^2}\,\hat{r} = 1.44 \times 10^{-9}\ (\mathrm{N/C})\,\hat{r}.$$

Note that the direction of that field is radially away from the proton, since
the unit vector r̂ always points radially outward from the origin. An electron,
having negative charge, produces an electric field of the same magnitude as
that of the proton, but the electron’s electric field points toward the electron.
To see that, note that when you plug in a negative charge for q in Eq. 3.28, you
have

$$\vec{E} = k_e \frac{(-1.6 \times 10^{-19}\ \mathrm{C})}{(1\ \mathrm{m})^2}\,\hat{r} = -1.44 \times 10^{-9}\ (\mathrm{N/C})\,\hat{r} = 1.44 \times 10^{-9}\ (\mathrm{N/C})\,(-\hat{r}),$$

where the minus sign tells you that the direction of the electron’s electric field
is in the negative r̂ direction, which is toward the source charge (since r̂ is
always radially outward, minus r̂ is always radially inward). This is consistent
with electric field lines beginning on positive charge and ending on negative 
charge. 
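
The proton and electron numbers above are easy to reproduce. The following sketch evaluates the radial component of Eq. 3.28; the function name and the sign convention (positive means outward along r̂) are assumptions chosen for illustration.

```python
K_E = 8.99e9  # Coulomb constant in N m^2/C^2, as given in the text


def radial_field(charge, distance):
    """Return the radial component of E for a point charge (Eq. 3.28).

    A positive result points outward along r-hat; a negative result
    points inward, toward the source charge.
    """
    return K_E * charge / distance ** 2


proton_field = radial_field(1.6e-19, 1.0)
electron_field = radial_field(-1.6e-19, 1.0)

print(f"proton at 1 m:   {proton_field:.3e} N/C")
print(f"electron at 1 m: {electron_field:.3e} N/C")

# Doubling the distance reduces the field to one quarter of its value.
falloff = radial_field(1.6e-19, 2.0) / proton_field
print(f"field at 2 m relative to 1 m: {falloff:.2f}")
```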

To understand how to add the vector electric fields, consider the situation
shown in Figure 3.18. Note that q_1 is positive, so its electric field must point
radially outward from the location of q_1, while q_2 and q_3 are negative, so their
electric fields must point radially inward toward their locations. To find the
total electric field at the position of the electron, it may help you to picture the
fields produced by q_1, q_2, and q_3 as shown in Figure 3.19.

Figure 3.18 Example values for charges near an electron. [Labels surviving from
the figure: electron; (0, 0) cm; (−5, +4) cm; (+7, +2) cm; (−5, −4) cm;
q_2 = −6 nC; q_3 = −8 nC.]

Figure 3.19 The electric fields produced by charges q_1, q_2, and q_3.

If you read the discussion of field lines earlier in this section, you should 
realize that the electric field exists between the lines as well as at the locations 
of the lines themselves. But just to help you visualize the direction of the fields 
from each of the three charges, the field lines in Figure 3.19 have been drawn 
on a tilt so that they are directly in line with the location at which you’re trying 
to find the total field (the origin in this case). You should also remember that 
just because the lines have grown too small to see does not mean that the field 
has gone to zero. Hence the electric field produced by q_1 points down and to
the right at the location of the electron, the field from q_2 points down and to
the left, and the field from q_3 points up and to the right. It is these three vector
fields that you will have to add together to determine the total electric field at 
the point of interest. 

Using Eq. 3.28, the electric fields due to the three point charges q_1, q_2, and
q_3 may be written as

$$\begin{aligned}
\vec{E}_1 &= k_e \frac{q_1}{r_1^2}\,\hat{r}_1, \\
\vec{E}_2 &= k_e \frac{q_2}{r_2^2}\,\hat{r}_2, \\
\vec{E}_3 &= k_e \frac{q_3}{r_3^2}\,\hat{r}_3.
\end{aligned} \qquad (3.29)$$

Of course, you know from Figure 3.19 that these three electric fields do not
point in the same direction. That’s because the unit vector r̂_1 points radially
outward from the location of charge q_1, and r̂_2 and r̂_3 point radially outward
from q_2 and q_3, respectively. This means you can’t add the three electric fields
algebraically; to find the total field you must use vector addition. You’ll find
an example of the vector addition of electric fields in the problems at the end 
of this chapter and the on-line solutions. 
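
If you’d like to see the bookkeeping of that vector addition before tackling the problem, here is a minimal sketch. The three charges and their positions are hypothetical values picked so the answer is easy to check by hand; they are not the values of Figure 3.18, and the function name is an assumption.

```python
import math

K_E = 8.99e9  # Coulomb constant in N m^2/C^2


def field_at(point, charges):
    """Add the fields of several point charges as vectors (two dimensions).

    `charges` is a list of (charge in C, (x, y) position in m) pairs.
    """
    ex, ey = 0.0, 0.0
    px, py = point
    for q, (sx, sy) in charges:
        # Vector from the source charge to the field point.
        dx, dy = px - sx, py - sy
        r = math.hypot(dx, dy)
        # k_e q / r^2 times the unit vector (dx, dy) / r.
        scale = K_E * q / r ** 3
        ex += scale * dx
        ey += scale * dy
    return ex, ey


# Hypothetical charges, chosen so each contribution lies along an axis.
charges = [
    (+1e-9, (-1.0, 0.0)),   # pushes the field at the origin toward +x
    (-1e-9, (+1.0, 0.0)),   # pulls the field at the origin toward +x
    (+2e-9, (0.0, -2.0)),   # pushes the field at the origin toward +y
]
e_x, e_y = field_at((0.0, 0.0), charges)
print(f"E = ({e_x:.3f}, {e_y:.3f}) N/C")

# Force on an electron placed at the origin, from Eq. 3.27.
q_electron = -1.6e-19
f_x, f_y = q_electron * e_x, q_electron * e_y
print(f"F = ({f_x:.3e}, {f_y:.3e}) N")
```

The magnitudes alone (8.99, 8.99, and 4.495 N/C) would add to 22.475 N/C, but the vector sum has magnitude of only about 18.5 N/C, which is the point of adding components rather than magnitudes.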

As you might suspect, it’s not just the simple operations of vector addition 
and multiplication by a scalar that find use in electrostatics. If you followed 
the discussion of the divergence operation in Chapter 2, you may be wonder- 
ing about the divergence of the electrostatic fields produced by a point charge 
(Figures 3.16 and 3.17). In fact, one of the fundamental laws of electrostatics 
is Gauss’s Law for electric fields, the differential form of which is 

$$\nabla \cdot \vec{E} = \rho/\varepsilon_0, \qquad (3.30)$$


where ρ represents the volume electric charge density (coulombs per cubic
meter) and ε_0 is the permittivity of free space (8.85 × 10^-12 C²/(N m²)).

Gauss’s Law for electric fields tells you that electric field lines diverge from
any location at which positive charge exists (positive ρ) and converge upon
any location at which negative charge is present (negative ρ). This explains the
analogy between the “flow” of electrostatic field lines and the flow of a fluid.
In this analogy, positive charge acts as the “source” of electrostatic field lines
in the same sense as a faucet acts as the source of fluid, and negative charge
acts as a “sink” of electrostatic field lines just as a drain does for fluid.

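Note what happens when you take the divergence of the electric field of a
point charge (this is most easily done in spherical coordinates):

$$\nabla \cdot \vec{E} = \frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2 E_r\right) = \frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\,\frac{k_e q}{r^2}\right) = \frac{1}{r^2}\frac{\partial}{\partial r}\left(k_e q\right) = 0.$$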


This is consistent with the worked example in Chapter 2 showing that the
divergence of any radial vector field is zero if the amplitude of the field falls
off as 1/r². Zero, that is, at all locations except where r = 0, the location of
the source of the field. Thus Gauss’s Law tells you that electrostatic field lines
diverge only from those locations at which positive electric charge exists, and 
converge only on those locations at which negative charge exists. 
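
You can check the zero-divergence result numerically by estimating the radial derivative with a small finite step. In this sketch the amplitude 1.0 stands in for k_e q, the step size h is an arbitrary small number, and the 1/r field is included only as a hypothetical contrast case; all of these are assumptions made for illustration.

```python
def radial_divergence(e_r, r, h=1e-6):
    """Estimate (1/r^2) d/dr (r^2 E_r) with a central difference."""
    def r_squared_times_field(s):
        return s ** 2 * e_r(s)

    derivative = (r_squared_times_field(r + h)
                  - r_squared_times_field(r - h)) / (2 * h)
    return derivative / r ** 2


def inverse_square_field(r):
    """Radial field falling off as 1/r^2, like that of a point charge."""
    return 1.0 / r ** 2


def inverse_r_field(r):
    """Hypothetical radial field falling off only as 1/r, for contrast."""
    return 1.0 / r


div_point_charge = radial_divergence(inverse_square_field, 2.0)
div_contrast = radial_divergence(inverse_r_field, 2.0)

print(f"1/r^2 field at r = 2: divergence = {div_point_charge:.2e}")
print(f"1/r field at r = 2:   divergence = {div_contrast:.4f}")
```

The 1/r² field gives zero to within rounding error, while the slower 1/r falloff leaves a positive divergence of 1/r², which is 0.25 at r = 2.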

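You can gain additional understanding of the behavior of the electrostatic
field by considering the curl of E for a point charge. Since E_θ and E_φ are both
zero, the curl in spherical coordinates becomes

$$\nabla \times \vec{E} = \frac{1}{r\sin\theta}\frac{\partial E_r}{\partial \phi}\,\hat{\theta} - \frac{1}{r}\frac{\partial E_r}{\partial \theta}\,\hat{\phi} = \frac{1}{r\sin\theta}\frac{\partial}{\partial \phi}\left(\frac{k_e q}{r^2}\right)\hat{\theta} - \frac{1}{r}\frac{\partial}{\partial \theta}\left(\frac{k_e q}{r^2}\right)\hat{\phi} = 0.$$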


This is not a surprising result in light of the radial nature of the electrostatic 
field of a point charge. 

As mentioned in Chapter 2, vector fields with zero curl are called irro- 
tational, and such fields have several important properties. One of those 
properties arises from the fact that the curl of a gradient is always zero: an 
irrotational vector field may always be written as the gradient of a scalar field. 

In the case of electrostatic fields, the electric field may be written as the
gradient of the scalar electric potential (usually written as φ or V). By convention,
the electric field is written as the negative gradient of the scalar potential, so
you’re likely to see this relationship written as

$$\vec{E} = -\nabla V, \qquad (3.31)$$

where V is the scalar electric potential with units of N m/C (equivalent to joules
per coulomb or volts).

Since the electric field is the negative of the change in electric potential 
with distance, moving along an electric field line in the direction it’s pointing 
means that you’re moving toward a region of lower electric potential. Likewise, 
moving in the opposite direction (opposite to the direction of the field) takes 
you into a region of higher potential, and moving perpendicular to the field 
lines results in no change in potential. Hence the “equipotential” surfaces are 
always perpendicular to the electric field lines. 
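
To convince yourself that the negative gradient of the potential really reproduces the field of Eq. 3.28, you can differentiate the point-charge potential V = k_e q/r numerically. This is a minimal sketch: that form of V is the standard point-charge potential rather than an expression taken from this section, and the 1 nC charge, the field point, and the step size h are hypothetical choices.

```python
import math

K_E = 8.99e9  # Coulomb constant in N m^2/C^2


def potential(q, x, y, z):
    """Potential V = k_e q / r of a point charge at the origin."""
    return K_E * q / math.sqrt(x * x + y * y + z * z)


def field_from_potential(q, x, y, z, h=1e-6):
    """Estimate E = -grad V (Eq. 3.31) with central differences."""
    ex = -(potential(q, x + h, y, z) - potential(q, x - h, y, z)) / (2 * h)
    ey = -(potential(q, x, y + h, z) - potential(q, x, y - h, z)) / (2 * h)
    ez = -(potential(q, x, y, z + h) - potential(q, x, y, z - h)) / (2 * h)
    return ex, ey, ez


q = 1e-9                      # hypothetical 1 nC charge
point = (0.3, 0.4, 0.0)       # hypothetical field point, 0.5 m from the charge

numeric = field_from_potential(q, *point)

# Eq. 3.28 in Cartesian form: k_e q / r^2 times the unit vector (x, y, z) / r.
r = math.sqrt(sum(c * c for c in point))
analytic = tuple(K_E * q * c / r ** 3 for c in point)

print("from -grad V:", tuple(round(c, 3) for c in numeric))
print("from Eq. 3.28:", tuple(round(c, 3) for c in analytic))
```

The z-component vanishes because moving a short distance in z from this point is a move perpendicular to the field, which leaves the potential unchanged.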

Another differential vector operation useful in electrostatics is the Laplacian
(∇²). Recall that the Laplacian involves the second spatial derivative,
specifically the divergence of the gradient. Since the electrostatic field E may be
written as the negative of the gradient of the scalar potential V, taking the
divergence of the electric field gives:

$$\nabla \cdot \vec{E} = \nabla \cdot (-\nabla V) = -\nabla^2 V. \qquad (3.32)$$

Since Gauss’s Law says that the divergence of the electrostatic field must equal
ρ/ε_0, this means

$$\nabla^2 V = -\rho/\varepsilon_0. \qquad (3.33)$$

This is known as Poisson’s Equation. Since the Laplacian finds peaks and
valleys of a function (locations at which the value of the function differs from the
average value at surrounding locations), Poisson’s Equation tells you that the
electric potential can have local maxima and minima only at locations at which
charge is present (that is, where ρ ≠ 0). And if you recall that the Laplacian
is negative at peaks and positive at valleys, you can see that positive charge
produces a peak in electric potential while negative charge produces a valley.
This is one reason that the electric field is taken as the negative gradient of the
electric potential.

In regions in which the electric charge density (ρ) is zero, Poisson’s
Equation becomes Laplace’s Equation:

$$\nabla^2 V = 0, \qquad (3.34)$$


so there are no maxima or minima in electric potential for locations with zero 
charge density. 
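
One way to make the “peaks and valleys” picture concrete is to estimate the Laplacian by comparing a function’s value at a point with the values at six neighboring points. The sketch below is an illustration under stated assumptions: the seven-point estimate is a standard numerical approximation rather than a method from this book, the test point and step size h are arbitrary, and the bowl-shaped function is a hypothetical contrast case.

```python
import math


def discrete_laplacian(f, x, y, z, h=1e-3):
    """Estimate the Laplacian of f from its six nearest neighbors.

    The result is proportional to (average of neighbors) - (value at center).
    """
    neighbors = (f(x + h, y, z) + f(x - h, y, z)
                 + f(x, y + h, z) + f(x, y - h, z)
                 + f(x, y, z + h) + f(x, y, z - h))
    return (neighbors - 6.0 * f(x, y, z)) / h ** 2


def point_charge_shape(x, y, z):
    """1/r, the shape of a point charge's potential away from the charge."""
    return 1.0 / math.sqrt(x * x + y * y + z * z)


def bowl(x, y, z):
    """Hypothetical bowl-shaped function r^2 with a valley at the origin."""
    return x * x + y * y + z * z


lap_potential = discrete_laplacian(point_charge_shape, 0.3, 0.4, 0.5)
lap_bowl = discrete_laplacian(bowl, 0.3, 0.4, 0.5)

print(f"Laplacian of 1/r away from the charge: {lap_potential:.1e}")
print(f"Laplacian of the bowl r^2:             {lap_bowl:.3f}")
```

Away from the charge the estimate for 1/r is zero to within the accuracy of the approximation, as Laplace’s Equation requires, while the bowl gives a positive value, as expected for a valley.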


3.4 The magnetic field 


In this section, you can read about the behavior of the magnetic field ( B ) and 
the magnetic force on a moving charged particle. You’ll also find a discus- 
sion of the application of the vector operations of divergence and curl to the 
magnetostatic field. 

Unlike electrostatic field lines, which diverge from positive charge and
converge on negative charge, magnetic field lines form circles around the electric
current (flowing charge) that is producing the magnetic field. And just as
stationary source charges produce electrostatic fields, stationary currents (in
which the charge flow is constant) produce magnetic fields that are called
“magnetostatic.” An example of such a field is shown in Figure 3.20. The
direction of those field lines is determined using the right-hand rule: if you
put the thumb of your right hand along the direction of current flow and curl
your fingers (like you’re grabbing the current), the magnetic field points in the
direction of your curled fingers. So if you were to reverse the direction of that
current flow, the magnetic field lines would still form circles around the
current, but the magnetic field lines would point in the opposite direction (as you
can tell by observing the direction of your curled fingers when your thumb
points in the opposite direction).

Figure 3.20 Magnetic field of a long, straight wire. [Figure labels:
current-carrying straight wire; B into page on one side; B out of page on the
other side.]

You can tell by the spacing of the field lines in Figure 3.20 that the strength 
of the magnetic field is decreasing as the distance from the current increases. 
For a thin wire of infinite length carrying current I, the vector magnetic field
is given by the equation

$$\vec{B} = \frac{\mu_0 I}{2\pi r}\,\hat{\phi}, \qquad (3.35)$$

where μ_0 is a constant called the magnetic permeability of free space, r is
the distance from the wire to the point at which the magnetic field is being
determined, and φ̂ is the cylindrical-coordinate unit vector that points in the
direction circulating around the wire. The standard (SI) unit of magnetic field 
is the tesla (T). 
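
To get a feel for the sizes involved, you can evaluate Eq. 3.35 for a specific wire. In this sketch the wire runs along the z-axis with current in the +z direction, and the 10 A current and 2 cm distance are hypothetical. The value μ_0 = 4π × 10^-7 T m/A is the conventional one and is not quoted in the text; the function name is an assumption.

```python
import math

# Conventional value of the permeability of free space in T m/A
# (an assumption here, since the text does not quote a number).
MU_0 = 4 * math.pi * 1e-7


def wire_field(current, x, y):
    """Return (Bx, By) at the point (x, y) for a wire along the z-axis.

    Positive current flows in the +z direction (Eq. 3.35).
    """
    r = math.hypot(x, y)
    magnitude = MU_0 * current / (2 * math.pi * r)
    # The unit vector phi-hat at (x, y) is (-y, x) / r.
    return magnitude * (-y / r), magnitude * (x / r)


b_x, b_y = wire_field(current=10.0, x=0.02, y=0.0)
print(f"B at 2 cm: ({b_x:.1e}, {b_y:.1e}) T")

# Twice as far from the wire, the field is half as strong.
_, b_far = wire_field(current=10.0, x=0.04, y=0.0)
print(f"B at 4 cm: {b_far:.1e} T")

# Reversing the current reverses the direction of the field.
_, b_reversed = wire_field(current=-10.0, x=0.02, y=0.0)
print(f"B at 2 cm with reversed current: {b_reversed:.1e} T")
```

With the current along +z and the field point on the +x axis, the field points along +y, which is what the right-hand rule predicts.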

Comparing the magnetic field lines around an electric current to the vector 
fields with various values of divergence and curl discussed in Chapter 2, you 
may have already guessed that magnetic fields fit into the “low divergence, 
high curl” category. Recall that electric field lines originate on positive charge 
and terminate on negative charge, and it is only at the location of those charges 
that the divergence of the electrostatic field is non-zero. And since magnetic 
field lines circulate back onto themselves rather than diverging from and con- 
verging upon specific locations, it’s reasonable to expect small values for the 
divergence of the magnetic field. In fact, the divergence of the magnetic field 
(B) is exactly zero, as indicated by Gauss’s Law for magnetic fields: 

$$\nabla \cdot \vec{B} = 0. \qquad (3.36)$$

You can verify this for the magnetic field of a long, straight wire by taking the 
divergence of the field in Eq. 3.35: 

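$$\nabla \cdot \vec{B} = \frac{1}{r}\frac{\partial B_\phi}{\partial \phi} = \frac{1}{r}\frac{\partial}{\partial \phi}\left(\frac{\mu_0 I}{2\pi r}\right) = 0.$$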

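As you might expect from the discussion of curl in Chapter 2, the magnetic
field around a current-carrying wire has zero curl:

$$\nabla \times \vec{B} = \frac{1}{r}\frac{\partial (r B_\phi)}{\partial r}\,\hat{z} = \frac{1}{r}\frac{\partial}{\partial r}\left(r\,\frac{\mu_0 I}{2\pi r}\right)\hat{z} = \frac{1}{r}\frac{\partial}{\partial r}\left(\frac{\mu_0 I}{2\pi}\right)\hat{z} = 0.$$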


As in the case of the divergence of the electric field, which has a non-zero value 
only at locations at which charge exists, the only locations at which the curl of 
the magnetic field is non-zero are locations at which current exists (that is, at 
the singularity point r = 0). 

Other uses of vectors and vector operations come about when you consider 
the force (F_B) produced by a magnetic field (B) on a moving electric charge
(q). This force is given by the vector equation

$$\vec{F}_B = q\,\vec{v} \times \vec{B}, \qquad (3.37)$$

where v is the velocity of the charged particle with respect to the magnetic
field. The magnitude of the force is readily found using the definition of the
magnitude of the vector cross product (|A × B| = |A||B| sin θ):

$$|\vec{F}_B| = |q|\,|\vec{v}|\,|\vec{B}|\,\sin\theta, \qquad (3.38)$$

where θ is the angle between vector v and vector B.

Examined carefully, Eqs. 3.37 and 3.38 can tell you a great deal about how 
magnetic fields affect charged particles. Compare these equations to Eq. 3.27 
(F_e = qE), and note that there are similarities and differences between
electric and magnetic forces:

• Similarity: Both are directly proportional to the amount of charge (q);

• Similarity: Both are directly proportional to the field strength (E or B);

• Difference: The velocity (v) of the particle appears in the magnetic
equation;

• Difference: The magnetic force depends on the angle between the velocity 
and the magnetic field; 

• Difference: The magnetic force is perpendicular to both the velocity and the 
magnetic field. 

The similarities seem reasonable: both electric and magnetic forces are 
stronger if the fields are stronger and if the amount of charge is greater. Also, 
charges with opposite signs feel forces in opposite directions. The first listed 
difference (the fact that the magnetic force depends on the velocity of the
particle) has the interesting consequence that a charged particle at rest with respect
to the magnetic field (v = 0) feels no force whatsoever from that field. And
for particles moving with respect to the magnetic field, the faster the particle
moves, the stronger the magnetic force becomes.

The presence of the vector cross product in the magnetic force equation 
also has some important consequences. One of those consequences is that 
charged particles moving in a direction parallel or antiparallel to the magnetic 
field feel zero magnetic force. That’s because in both the parallel (θ = 0°)
and antiparallel (θ = 180°) cases, the sine term in Eq. 3.38 is zero. So
the closer the angle θ between v and B is to 90°, the stronger the magnetic
force. 

Another consequence of the vector cross product in Eq. 3.37 is that the
magnetic force (F_B) can never point in the direction of the magnetic field, since the
vector result of the cross product is by definition perpendicular to both vectors
forming the product (v and B in this case). For this same reason, the magnetic
force can never point in the direction of the particle’s velocity vector, and must 
in fact be perpendicular to that vector. So if you imagine the flat plane formed 
by the velocity vector and the magnetic field, you can be sure that the magnetic 
force (if any) must be perpendicular to that plane. 

If you’ve read the discussion of radial and tangential acceleration in Sec- 
tion 3.2, you should understand that this means that magnetic fields can 
provide radial but never tangential acceleration to a charged particle (since 
tangential acceleration requires a component of force that’s either parallel 
or antiparallel to the velocity vector). And since v x B always points per- 
pendicular to v, magnetic fields can provide only radial acceleration. Thus 
magnetic fields may change the direction but never the speed of charged 
particles. 

An example of the geometry involved in magnetic force is shown in
Figure 3.21. In this figure, the direction of the magnetic field is into the page, as
indicated by the crosses inside circles,¹¹ and the charged particle (q) is moving
to the right.

Figure 3.21 Charged particle moving to right; magnetic field into page.

Figure 3.22 Magnetic force for positive and negative charges. [Figure labels:
force in same direction as v × B if q positive; force in opposite direction from
v × B if q negative; push v into B (into page) with right hand; thumb shows
direction of v × B.]

To determine the direction of the magnetic force in this case, you simply 
have to imagine forming the vector cross product v x B using the right-hand 
rule, as shown in Figure 3.22. Once you know the direction of v x B, it’s 
very important to remember (but easy to forget) that you must then reverse 
the direction if the charge q is negative (since by Eq. 3.37, F_B = qv × B,
meaning that the magnetic force is opposite to the direction of v × B if
q is negative). This explains why two directions for the magnetic force F_B
are shown in Figure 3.22: upward if q is positive and downward if q is 
negative. 
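
If the right-hand rule leaves you unsure, you can let the components of the cross product settle the question. The sketch below uses the geometry described here, with +x to the right, +y up the page, and +z out of the page, so a field into the page points along −z. The speed and field strength are hypothetical, and the function names are assumptions.

```python
import math


def cross(a, b):
    """Return the vector cross product a x b of two 3-component vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def magnetic_force(q, v, b):
    """Return F_B = q v x B (Eq. 3.37)."""
    return tuple(q * component for component in cross(v, b))


velocity = (2.0e5, 0.0, 0.0)   # moving to the right; hypothetical speed in m/s
b_field = (0.0, 0.0, -0.5)     # into the page; hypothetical strength in T

force_positive = magnetic_force(+1.6e-19, velocity, b_field)
force_negative = magnetic_force(-1.6e-19, velocity, b_field)

print(f"positive charge: F_y = {force_positive[1]:.2e} N (up the page)")
print(f"negative charge: F_y = {force_negative[1]:.2e} N (down the page)")

# Eq. 3.38 with theta = 90 degrees gives the same magnitude.
magnitude = 1.6e-19 * 2.0e5 * 0.5 * math.sin(math.radians(90.0))
print(f"|F_B| from Eq. 3.38: {magnitude:.2e} N")
```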

Once you understand the direction of the magnetic force relative to the
velocity of the charged particle, it should help explain why you may have
heard or read about charged particles “circling around magnetic field lines” or
perhaps “spiralling along the magnetic field.” Consider the positively charged
particle q in Figure 3.23. If this particle is initially at the leftmost position in
the figure, travelling with velocity v straight up the page, and the magnetic field
B points directly out of the page, the direction of the magnetic force qv × B is
initially to the right (as you can determine using the right-hand rule). This force
causes the particle to travel on the dashed path to the topmost position in the
figure. At that point, the magnetic force F_B points straight down the page. Just
as at the previous position, since q is positively charged, the magnetic force
points in the same direction as v × B. This now-downward force causes the
particle to travel to the rightmost position, at which point the velocity is straight
down the page and the magnetic force F_B points directly to the left. This force
causes the particle to reach the bottom position in Figure 3.23, at which point
the velocity is to the left and the magnetic force points straight up the page.
Under the influence of this force, the particle will travel back to the starting
(leftmost) position, and the entire cycle will repeat. So this positively charged
particle makes a clockwise circle around the outward-pointing magnetic
field.

Figure 3.23 Magnetic force on positive charge.

11 This is common notation in physics and engineering; you can remember it by thinking of a
hunter’s feathered arrow. Seen from the back, you can see the back edges of the feathers, so it
looks like this: ⊗. But seen from the front, you can see the arrow’s point, so it looks like
this: ⊙.

Applying the same reasoning to a negatively charged particle, you should 
be able to determine that it will make counter-clockwise circles around the 
same outward-pointing magnetic field. And if the field direction is reversed, 
so that B points into the page rather than outward, the sense of the parti- 
cle’s rotation will be reversed (so that a positively charged particle will circle 
counter-clockwise and a negatively charged particle will circle in the clockwise 
direction). 
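
The four cases described above can be checked with two cross products: one to find the force, and one to see which way that force bends the velocity. This is a minimal sketch that assumes +z points out of the page toward you, so a turn toward +z is counter-clockwise as you look at the page; the function names and the unit-speed starting velocity are illustrative assumptions.

```python
def cross(a, b):
    """Return the vector cross product a x b of two 3-component vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def rotation_sense(q, b_z, velocity=(0.0, 1.0, 0.0)):
    """Report which way a charge circles in a field along the z-axis.

    b_z > 0 means the field points out of the page and b_z < 0 means it
    points into the page. The velocity must lie in the plane of the page.
    """
    force = tuple(q * c for c in cross(velocity, (0.0, 0.0, b_z)))
    # The z-component of v x F tells you which way the velocity is turning.
    turning = cross(velocity, force)[2]
    if turning == 0:
        return "no deflection"
    return "counter-clockwise" if turning > 0 else "clockwise"


print("positive charge, B out of page:", rotation_sense(+1.6e-19, +1.0))
print("negative charge, B out of page:", rotation_sense(-1.6e-19, +1.0))
print("positive charge, B into page:  ", rotation_sense(+1.6e-19, -1.0))
print("negative charge, B into page:  ", rotation_sense(-1.6e-19, -1.0))
```

The result doesn’t depend on the starting direction of the in-plane velocity; passing a different `velocity` gives the same sense of rotation.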

The particles in these examples retrace the same path over and over, so what 
makes some particles “spiral around” the lines of the magnetic field? Simply 
this: the particle’s velocity must have a component parallel (or antiparallel) to 
the direction of the magnetic field. Note that the particle shown in Figure 3.23 
is moving entirely in the plane of the page, and the magnetic field is
perpendicular to the page. Hence the particle’s velocity vector has no component
along the magnetic field (into or out of the page). If such a component were
present, the particle would have a component of its motion along the field
lines while also circling around them. In that case, the circular path shown in
Figure 3.23 would move into or out of the paper over time, and the circle would
become a spiral. The magnetic field has no effect on the velocity component
(v_∥) parallel or antiparallel to the field (since there’s no magnetic force in that
direction), so the speed with which the particle moves along the field line is 
constant as long as no other forces are acting. 
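
To see why the parallel component is left alone, you can split a velocity into its parts along and across the field and then look at the force. In this sketch the velocity, the field strength, and the function names are all hypothetical choices made for illustration.

```python
import math


def dot(a, b):
    """Return the scalar (dot) product of two 3-component vectors."""
    return sum(ai * bi for ai, bi in zip(a, b))


def cross(a, b):
    """Return the vector cross product a x b of two 3-component vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def split_velocity(v, b):
    """Split v into parts parallel and perpendicular to the field b."""
    b_magnitude = math.sqrt(dot(b, b))
    b_hat = tuple(c / b_magnitude for c in b)
    v_parallel = tuple(dot(v, b_hat) * c for c in b_hat)
    v_perpendicular = tuple(vi - pi for vi, pi in zip(v, v_parallel))
    return v_parallel, v_perpendicular


q = 1.6e-19                      # positive charge in coulombs
velocity = (3.0e4, 0.0, 4.0e4)   # hypothetical velocity in m/s
b_field = (0.0, 0.0, 0.2)        # hypothetical field along +z in T

v_par, v_perp = split_velocity(velocity, b_field)
force = tuple(q * c for c in cross(velocity, b_field))

print("parallel part (drifts along the field):", v_par)
print("perpendicular part (circles the field):", v_perp)
print("force:", force)
print("force component along B:", dot(force, b_field))
print("force component along v:", dot(force, velocity))
```

Both dot products come out to zero: with no force along the field, the 4 × 10^4 m/s drift continues unchanged, and with no force along the velocity, the speed stays constant while the perpendicular part keeps turning.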


3.5 Chapter 3 problems 

3.1 Solve the box-on-a-ramp problem (that is, find the acceleration of 
the box) for the frictionless case using a Cartesian coordinate system 
for which the y-axis points vertically upward and the x-axis points 
horizontally to the right. 

3.2 The maximum force of static friction is $\mu_s F_n$, where $\mu_s$ is the coefficient
of static friction and $F_n$ is the normal force. How big must the coefficient
of static friction $\mu_s$ be to prevent a box of mass m from sliding down a
ramp inclined 20 degrees from the horizontal?

3.3 If a delivery woman pushes a box of mass m up a 2 m ramp with a force 
of 10 N, how fast is the box moving at the top of the ramp if the ramp
angle to the horizontal is 25 degrees and the coefficient of kinetic friction 
is 0.33? 

3.4 If the hammer-thrower shown on the cover of this book wishes to 
launch a hammer of mass 7.26 kg on a cable of length 1.22 m with a 
speed of 22 m/s, what is the magnitude of the centripetal force he must 
supply? 

3.5 Imagine a Formula 1 car going around a curve with radius of 10 m while 
slowing from a speed of 180 mph to 120 mph in 2 s. What are the magni- 
tude and direction of the car’s acceleration at the instant the car’s speed 
is 150 mph? 

3.6 If three electric charges $q_1$, $q_2$, and $q_3$ have the values and locations
shown in Figure 3.18, find the electric field they produce at the origin
(x = 0, y = 0), then use your value of the field to determine the electric
force on an electron at that location. 

3.7 If the vector electric field E in some region is given in spherical coordi- 
nates by j r + 2 sin 9 cos </> 0 — ^ sin 6 cos (p cp (N/C), what is the volume 
charge density p in that region? 

3.8 If the scalar electric potential V in some region is given in cylindrical
coordinates by $V(r, \phi, z) = r^2 \sin\phi \, e^{-2/z}$, what is the electric field E in
that region?

3.9 For the scalar electric potential V of Problem 3.8, use Poisson’s Equation
to find the volume charge density $\rho$ in that region.

3.10 Find the magnitude and direction of the magnetic force on a charged
particle with charge $-4$ nC and velocity $\vec{v} = 2.5 \times 10^{4}\,\hat{\imath} + 1.1 \times 10^{4}\,\hat{\jmath}$
(m/s) if the magnetic field in the region is given by $\vec{B} = 1.2 \times 10^{-3}\,\hat{\imath} +
5.6 \times 10^{-3}\,\hat{\jmath} - 3.2 \times 10^{-3}\,\hat{k}$ (T).


4 

Covariant and contravariant 
vector components 


The vector concepts and techniques described in the previous chapters are 
important for two reasons: they allow you to solve a wide range of problems 
in physics and engineering, and they provide a foundation on which you can 
build an understanding of tensors (the “facts of the universe”). To achieve that 
understanding, you’ll have to move beyond the simple definition of vectors as 
objects with magnitude and direction. Instead, you’ll have to think of vectors 
as objects with components that transform between coordinate systems in spe- 
cific and predictable ways. It’s also important for you to realize that vectors 
can have more than one kind of component, and that those different types of 
component are defined by their behavior under coordinate transformations. 

So this chapter is largely about the different types of vector component, 
and those components will be a lot easier to understand if you have a solid 
foundation in the mathematics of coordinate-system transformation. 


4.1 Coordinate-system transformations 

In taking the step from vectors to tensors, a good place to begin is to con- 
sider this question: “What happens to a vector when you change the coordinate 
system in which you’re representing that vector?” The short answer is that 
nothing at all happens to the vector itself, but the vector’s components may be 
different in the new coordinate system. The purpose of this section is to help 
you understand how those components change. 

Before getting to that, you should spend a few minutes considering the 
statement that the vector itself doesn’t change if you change the coordinate 
system. This may seem obvious in the case of scalars - after all, whether you 
measure temperature in Celsius or Fahrenheit doesn’t make a room feel hot- 
ter or colder. Now remember that vectors are mathematical representations of 
physical entities, and those entities don’t change just because you change the
coordinate system in which you’re representing them. Think about it: does the 
size of a room change if you tilt your head to one side? Clearly not. But if you 
use your tilted head to define up and down, then the points you designate as 
the top and bottom of the room may change, and this will change what you 
call the “height” and “width” of the room. The important idea is that the room 
itself doesn’t change (it “remains invariant”) under such a change of coordinate 
system. And if you define the center of your head to be the origin of your coor- 
dinate system, then walking toward one wall will “offset” the room (that is, the 
x, y, and z values of locations within the room may change), but once again 
the room itself is unchanged. Likewise, specifying dimensions of the room in 
inches rather than meters will allow you to put larger numbers in the real-estate 
ad, but that doesn’t mean your room will hold a bigger sofa. 

So if coordinate-system transformations such as rotation, translation, and 
scaling leave physical quantities unchanged, what exactly does happen to a 
vector when you transform coordinates? To understand that, consider the sim- 
ple rotation of the two-dimensional Cartesian coordinate system shown in 
Figure 4.1. In this transformation, the location of the origin has not changed, 
but both the x- and y-axes have been tilted counter-clockwise by an angle $\theta$.
The rotated axes are labeled x' and y' and are drawn using dashed lines to 
distinguish them from the original axes. 

What impact does this rotation have on a vector in this space? Take a look 
at vector A and its components in Figure 4.2(a) and (b). Note that the rotation 
has no effect on the length or direction of A (at first glance, A may look a 
bit different in Figure 4.2(a) and 4.2(b), but you can verify using a ruler and 
protractor that the vector itself is exactly the same). But the rotation has clearly 
caused the components of A to change: $A'_x$ (the x'-component of A in the
tilted coordinate system) is longer than $A_x$, and $A'_y$ is shorter than $A_y$. If you
were to continue rotating your axes in the same direction, you’d eventually
reach an angle at which A lies entirely along the x'-axis, at which point the
y'-component of A would vanish (that is, $A'_y = 0$) and the x'-component would
equal the length of A ($A'_x = |A|$).

Figure 4.1 Rotation of 2-D coordinate system.

Figure 4.2 Change in vector components due to rotation of coordinate
system.

Finding the change in the components of a vector due to rotation of the 
coordinate axes can be done both graphically using simple geometry and 
analytically using the dot product. You’ll find the graphical approach in this 
section; the analytical approach is the subject of one of the problems at the end 
of this chapter. 

If you think about the changes to $A_x$ and $A_y$ in Figure 4.2, you might come
to realize that the vector component $A'_x$ in the rotated coordinate system cannot
depend entirely on the component $A_x$ in the original system. After all, $A_x$
contains some but not all of the information about vector A; the rest is in $A_y$.
And as the axes rotate, the axis that had pointed exclusively in the x-direction
now points partially in the (former) y-direction. So it seems reasonable that
the portion of A that had previously pointed in the original y-direction (and
so contributed only to $A_y$) now points partially in the x'-direction, and hence
contributes to the x'-component as well as the y'-component.

You can see how this works in Figure 4.3. The (a) portion of this figure
shows how the vector component $A_x$ in the original (non-rotated) coordinate
system contributes to $A'_x$ in the rotated system, and the (b) portion shows how
the vector component $A_y$ in the original system contributes to $A'_x$ in the rotated
system.

As you can see in both portions of the figure, $A'_x$ can be considered to be
made up of two segments, labeled $\ell_1$ and $\ell_2$. So

\[ A'_x = \ell_1 + \ell_2, \tag{4.1} \]

and to determine how these segments depend on $A_x$ and $A_y$, consider the right
triangles shown in Figure 4.3. In the (a) portion of the figure, you can see that
$A_x$ is the hypotenuse of a right triangle formed by drawing a perpendicular
from the end of $A_x$ to the x'-axis. Call the angle between the x-axis and the
x'-axis $\alpha_{11}$ (the reason for using double subscripts will become clear when
rotations are written in matrix notation). Then the length of $\ell_1$ (the projection
of $A_x$ onto the x'-axis) is $A_x \cos(\alpha_{11})$. Hence

\[ \ell_1 = A_x \cos(\alpha_{11}). \tag{4.2} \]

To find the length of $\ell_2$, consider the right triangle shown in Figure 4.3(b).
In this case, the triangle is formed by sliding $A'_x$ upward along the y'-axis
and then drawing a perpendicular from the tip of $A'_x$ to the x-axis. From this
triangle, you should be able to see that

\[ \ell_2 = A_y \cos(\alpha_{12}), \tag{4.3} \]

where $\alpha_{12}$ is the angle formed by the tips of $A'_x$ and $A_y$ (which is also the angle
between the x'-axis and the y-axis, as you can see from the parallelogram in
Figure 4.3(b)).

Adding the expressions for $\ell_1$ and $\ell_2$, you can write $A'_x$ as

\[ A'_x = A_x \cos(\alpha_{11}) + A_y \cos(\alpha_{12}), \tag{4.4} \]

where $A_x$ and $A_y$ are the components of vector A in the non-rotated coordinate
system, $\alpha_{11}$ is the angle between the x'-axis and the x-axis, and $\alpha_{12}$ is the angle
between the x'-axis and the y-axis. You should note that the new component
($A'_x$) is a weighted linear combination of the original components ($A_x$ and
$A_y$). “Weighted” because the cosine factors determine how heavily each of the
original components contributes to the new one, “linear” because the original
components appear to the first power only, and “combination” because both
$A_x$ and $A_y$ contribute to $A'_x$.

A similar analysis for $A'_y$, the y'-component of vector A in the rotated
coordinate system, gives

\[ A'_y = A_x \cos(\alpha_{21}) + A_y \cos(\alpha_{22}), \tag{4.5} \]

where $\alpha_{21}$ is the angle between the y'-axis and the x-axis, and $\alpha_{22}$ is the angle
between the y'-axis and the y-axis.

The relationship between the components of vector A in the rotated and
non-rotated systems is conveniently expressed using vector/matrix notation¹ as

\[ \begin{pmatrix} A'_x \\ A'_y \end{pmatrix} = \begin{pmatrix} \cos(\alpha_{11}) & \cos(\alpha_{12}) \\ \cos(\alpha_{21}) & \cos(\alpha_{22}) \end{pmatrix} \begin{pmatrix} A_x \\ A_y \end{pmatrix}. \tag{4.6} \]

1 Remember, there’s a review of matrix notation and algebra on the book’s website.

This is called a “transformation equation” for the components of vector A, and
the two-column matrix is called a “transformation matrix.” The elements of
that matrix are called the “direction cosines.” Note that for a rigid rotation of
the Cartesian axes through angle $\theta$, the angles $\alpha_{11}$ and $\alpha_{22}$ are both equal to $\theta$,
while $\alpha_{12} = 90^\circ - \theta$ and $\alpha_{21} = 90^\circ + \theta$. The transformation matrix in this
case is

\[ \begin{pmatrix} \cos(\theta) & \cos(90^\circ - \theta) \\ \cos(90^\circ + \theta) & \cos(\theta) \end{pmatrix} = \begin{pmatrix} \cos(\theta) & \sin(\theta) \\ -\sin(\theta) & \cos(\theta) \end{pmatrix}, \tag{4.7} \]

since $\cos(90^\circ - \theta) = \sin(\theta)$ and $\cos(90^\circ + \theta) = -\sin(\theta)$.
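The direction-cosine construction above can be sketched in a few lines of code. This is a minimal illustration, assuming a counter-clockwise rotation of the axes by an angle given in degrees; the function names are hypothetical and chosen only for this example.

```python
import math

def direction_cosine_matrix(theta_deg):
    """Transformation matrix built from the four direction cosines
    for axes rotated counter-clockwise by theta_deg."""
    a11 = theta_deg          # angle between x'-axis and x-axis
    a12 = 90.0 - theta_deg   # angle between x'-axis and y-axis
    a21 = 90.0 + theta_deg   # angle between y'-axis and x-axis
    a22 = theta_deg          # angle between y'-axis and y-axis
    return [[math.cos(math.radians(a11)), math.cos(math.radians(a12))],
            [math.cos(math.radians(a21)), math.cos(math.radians(a22))]]

def components_in_rotated_axes(A, theta_deg):
    """Components of the same vector A referred to the rotated axes."""
    m = direction_cosine_matrix(theta_deg)
    return (m[0][0] * A[0] + m[0][1] * A[1],
            m[1][0] * A[0] + m[1][1] * A[1])

theta = 30.0
m = direction_cosine_matrix(theta)
t = math.radians(theta)
# The direction cosines reduce to the cos/sin form of the rotation matrix.
print(m)
print([[math.cos(t), math.sin(t)], [-math.sin(t), math.cos(t)]])

# The vector itself is unchanged, so its length is the same in both systems.
A = (2.0, 1.0)
A_new = components_in_rotated_axes(A, theta)
print(math.hypot(*A), math.hypot(*A_new))
```

Building the matrix from the four angles rather than directly from sines and cosines mirrors the geometric argument: each element is the cosine of the angle between one new axis and one original axis.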

To understand how this works in practice, consider vector A given as 

\[ \vec{A} = 5\hat{\imath} + 3\hat{\jmath} \tag{4.8} \]

in a two-dimensional Cartesian coordinate system. Now imagine that the 
x- and y-axes of that coordinate system are rotated counter-clockwise by 150°, 
as shown in Figure 4.4. 

Before jumping to the equations to find the components $A'_x$ and $A'_y$ in the
rotated coordinate system, it’s worth a few minutes to take a look at the diagram
to estimate what the effect of the rotation on the components will be. From
Figure 4.4(b), it’s pretty clear that both the $A'_x$ and $A'_y$ components will be
negative, and the $A'_y$ component appears to be somewhat larger in magnitude
than the $A'_x$ component.



Figure 4.4 2-D Cartesian axes rotated by 150°.

Figure 4.5 Angles between original and rotated axes.


Now that you have an idea of what to expect, you can insert the relevant
values into Eq. 4.6. You know that $A_x = 5$ and $A_y = 3$, and using the angles
shown in Figure 4.5, you should be able to see that $\alpha_{11} = 150^\circ$, $\alpha_{12} = 60^\circ$,
$\alpha_{21} = 240^\circ$, and $\alpha_{22} = 150^\circ$.

So you have

\[ \begin{pmatrix} A'_x \\ A'_y \end{pmatrix} = \begin{pmatrix} \cos(150^\circ) & \cos(60^\circ) \\ \cos(240^\circ) & \cos(150^\circ) \end{pmatrix} \begin{pmatrix} A_x \\ A_y \end{pmatrix}, \tag{4.9} \]

or

\[ A'_x = 5\cos(150^\circ) + 3\cos(60^\circ) = -2.8, \tag{4.10} \]

and

\[ A'_y = 5\cos(240^\circ) + 3\cos(150^\circ) = -5.1. \tag{4.11} \]

As a quick visual analysis suggested, both components are negative and the
y'-component is larger in magnitude than the x'-component in the rotated system.

It is very important for you to understand that the transformation equation
(4.6) does not rotate or change the vector A in any way; it determines the values
of the components of vector A in a new coordinate system. This distinction is
important because you may be tempted to apply this transformation matrix to
basis vectors such as $\hat{\imath}$ (1, 0) and $\hat{\jmath}$ (0, 1), which for a counter-clockwise 150°
rotation gives for $\hat{\imath}$

\[ \begin{pmatrix} \cos(150^\circ) & \cos(60^\circ) \\ \cos(240^\circ) & \cos(150^\circ) \end{pmatrix} \begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} 1\cos(150^\circ) + 0\cos(60^\circ) \\ 1\cos(240^\circ) + 0\cos(150^\circ) \end{pmatrix} = \begin{pmatrix} -0.866 \\ -0.5 \end{pmatrix} \tag{4.12} \]

and for $\hat{\jmath}$

\[ \begin{pmatrix} \cos(150^\circ) & \cos(60^\circ) \\ \cos(240^\circ) & \cos(150^\circ) \end{pmatrix} \begin{pmatrix} 0 \\ 1 \end{pmatrix} = \begin{pmatrix} 0\cos(150^\circ) + 1\cos(60^\circ) \\ 0\cos(240^\circ) + 1\cos(150^\circ) \end{pmatrix} = \begin{pmatrix} 0.5 \\ -0.866 \end{pmatrix}. \tag{4.13} \]

There’s nothing inherently wrong with doing this, as long as you remember
what the results mean: these are the components of the original unit vectors
$\hat{\imath}$ and $\hat{\jmath}$ (that is, the ones in the non-rotated coordinate system) expressed in
terms of the rotated coordinate axes, as you can see in Figure 4.6. These are
not the unit vectors $\hat{\imath}'$ and $\hat{\jmath}'$, which point in the direction of the x'- and y'-axes
(remember that in the primed coordinate system, the unit vectors $\hat{\imath}'$ and $\hat{\jmath}'$,
pointing along the rotated coordinate axes, must have components (1, 0) and
(0, 1), respectively).

Figure 4.6 Components of i and j in rotated coordinate system: (a) x'- and
y'-components of i; (b) x'- and y'-components of j.

Rigid rotation of Cartesian axes is only one type of the myriad coordinate
transformations that can change the components of a vector. But as long as the
new components can be written as weighted sums of the original components,
the transformation is linear and can be represented by a matrix equation. For
reasons that will become clear when you read Section 4.3 of this chapter, such
transformations of vector components are called “inverse” or “passive”
transformations, which means the matrix equation of such a transformation will
look like this:

\[ \begin{pmatrix} \text{Components of} \\ \text{same vector} \\ \text{in new system} \end{pmatrix} = \begin{pmatrix} \text{Inverse} \\ \text{transformation} \\ \text{matrix} \end{pmatrix} \begin{pmatrix} \text{Components of} \\ \text{vector in} \\ \text{original system} \end{pmatrix}. \tag{4.14} \]
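The 150° example can be reproduced with a short calculation. The sketch below assumes the four angles read from Figure 4.5 (150°, 60°, 240°, 150°) and applies the resulting matrix to A = (5, 3) and to the original unit vectors; the helper name `transform` is hypothetical.

```python
import math

# Angles (in degrees) between the rotated and original axes, as in Figure 4.5.
ANGLES_DEG = [[150.0, 60.0],
              [240.0, 150.0]]

# Matrix of direction cosines for the passive (component) transformation.
M = [[math.cos(math.radians(a)) for a in row] for row in ANGLES_DEG]

def transform(matrix, vec):
    """Multiply a 2x2 matrix by a 2-component vector."""
    return tuple(sum(matrix[r][c] * vec[c] for c in range(2)) for r in range(2))

A = (5.0, 3.0)
A_new = transform(M, A)              # components of the same vector A
i_in_new = transform(M, (1.0, 0.0))  # original i seen from the rotated axes
j_in_new = transform(M, (0.0, 1.0))  # original j seen from the rotated axes

print("A in rotated system:", [round(c, 2) for c in A_new])
print("i in rotated system:", [round(c, 3) for c in i_in_new])
print("j in rotated system:", [round(c, 3) for c in j_in_new])

# The vector is unchanged, so its length is the same in both systems.
print(math.hypot(*A), math.hypot(*A_new))
```

The equal lengths printed at the end are a quick check that only the components changed, not the vector.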

At this point, you may be wondering how you might go about transforming
the unit vectors of the original (non-rotated) system (that is, $\hat{\imath}$ and $\hat{\jmath}$) into
the unit vectors of the primed (rotated) system ($\hat{\imath}'$ and $\hat{\jmath}'$). That’s a different
question, because you’re no longer asking, “Given the components of a vector 
in one coordinate system, how do I find the components of that same vector 
in a different coordinate system?” Instead, you’re asking, “How do I change a 
given vector (in this case, a unit vector in one coordinate system) into a differ- 
ent vector (the unit vector in a different coordinate system)?” That question is 
addressed in the next section. 



4.2 Basis-vector transformations 

The previous section illustrated what happens to the components of a vector 
when the two-dimensional Cartesian axes are rotated, and the results are not 
surprising: the components of the vector referenced to the new (rotated) axes 
are different from the components referenced to the original (non-rotated) axes. 
More specifically, the new components are weighted linear combinations of the 
original components. 

Now here’s a very important point: as your studies carry you along the 
path from vectors to tensors, you will undoubtedly run across discussions of 
“covariant” and “contravariant” vector components. In those discussions, you 
may see words to the effect that covariant components transform in the same 
way as basis vectors (“co” ~ “with”), and contravariant components trans- 
form in the opposite way to basis vectors (“contra” ~ “against”). As you’ll 
see later in this chapter, there’s plenty of truth in that description, but there’s 
also a major pitfall. That’s because the “transformation” of basis vectors usu- 
ally refers to the conversion of the basis vectors in the original (non-rotated) 
coordinate system to the different basis vectors which point along the
coordinate axes in the new (rotated) system, whereas the “transformation” of vector
components refers to the change in the components of the same vector referred
to two different sets of coordinate axes. The potential for confusion here is
sufficiently great to cause Schutz to write that “the reason that ‘co’ and ‘contra’
have been abandoned is that they mix up two very different things.”³ Schutz
wrote that in 1983, and for better or worse, the “covariant/contravariant”
terminology is still with us - that’s why in this book you’ll find those words as
well as more modern terminology.

2 These components are identical in the Cartesian coordinate systems considered so far.

Why did the “covariant/contravariant” terminology take hold in the first 
place? Probably because the process of changing a vector into a different vec- 
tor has much in common with the process of transforming the components of 
a vector from one coordinate system to another. This section shows you how 
to make a new vector using rotation (specifically, how to rotate basis vectors). 

To understand the process of rotating a vector, consider vector A in
Figure 4.7(a). The rotation shown in Figure 4.7(b) causes vector A to point
in a different direction, which means it is no longer the same vector (which
is why it’s labeled A' after the rotation). The relationship between the
components of the original (non-rotated) vector and the new (rotated) vector can
be found rather easily through geometric constructions such as those shown in
Figure 4.8. In this example, the rotation angle is $\alpha$. The x- and y-components
of vectors A and A' are

\[ A_x = |A| \cos(\theta), \qquad A'_x = |A'| \cos(\theta'), \]
\[ A_y = |A| \sin(\theta), \qquad A'_y = |A'| \sin(\theta'). \]

But $\theta' = \alpha + \theta$, so the components $A'_x$ and $A'_y$ are

\[ A'_x = |A'| \cos(\alpha + \theta) = |A'| \left[ \cos(\alpha)\cos(\theta) - \sin(\alpha)\sin(\theta) \right], \]
\[ A'_y = |A'| \sin(\alpha + \theta) = |A'| \left[ \sin(\alpha)\cos(\theta) + \cos(\alpha)\sin(\theta) \right]. \]

Since the length of A must be the same as the length of A' (the vector rotated
but did not change length), you can write $|A| = |A'|$, which means that

\[ A'_x = |A'| \left[ \cos(\alpha)\cos(\theta) - \sin(\alpha)\sin(\theta) \right] = |A| \cos(\alpha)\cos(\theta) - |A| \sin(\alpha)\sin(\theta), \]
\[ A'_y = |A'| \left[ \sin(\alpha)\cos(\theta) + \cos(\alpha)\sin(\theta) \right] = |A| \sin(\alpha)\cos(\theta) + |A| \cos(\alpha)\sin(\theta). \]

But $|A| \cos(\theta)$ is just $A_x$ and $|A| \sin(\theta)$ is $A_y$, so you can write

\[ A'_x = A_x \cos(\alpha) - A_y \sin(\alpha), \]
\[ A'_y = A_x \sin(\alpha) + A_y \cos(\alpha), \]

or, as a matrix equation,

\[ \begin{pmatrix} A'_x \\ A'_y \end{pmatrix} = \begin{pmatrix} \cos(\alpha) & -\sin(\alpha) \\ \sin(\alpha) & \cos(\alpha) \end{pmatrix} \begin{pmatrix} A_x \\ A_y \end{pmatrix}, \tag{4.15} \]

which tells you how to find the components $A'_x$ and $A'_y$ of the new vector (A')
in the original coordinate system.

Figure 4.7 Rotation of a vector.

Figure 4.8 Angles involved in the rotation of a vector.

3 Schutz, B., A First Course in General Relativity, p. 64. See further reading.
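The derivation of the rotation matrix can be checked numerically by comparing it with the direct geometric route (keep the length, add the rotation angle to the polar angle). This is a minimal sketch with hypothetical function names and an arbitrary test vector.

```python
import math

def rotate_vector(A, alpha_deg):
    """Rotate vector A counter-clockwise by alpha using the matrix form."""
    a = math.radians(alpha_deg)
    Ax, Ay = A
    return (Ax * math.cos(a) - Ay * math.sin(a),
            Ax * math.sin(a) + Ay * math.cos(a))

def rotate_by_polar_angle(A, alpha_deg):
    """Same rotation via theta' = alpha + theta with unchanged length."""
    length = math.hypot(A[0], A[1])
    theta = math.atan2(A[1], A[0])
    theta_new = theta + math.radians(alpha_deg)
    return (length * math.cos(theta_new), length * math.sin(theta_new))

A = (2.0, 1.0)      # arbitrary test vector (assumed for illustration)
alpha = 40.0        # arbitrary rotation angle in degrees

via_matrix = rotate_vector(A, alpha)
via_angle = rotate_by_polar_angle(A, alpha)
print(via_matrix)
print(via_angle)
print(math.hypot(*A), math.hypot(*via_matrix))  # length is preserved
```

The two routes agree to rounding error, which is just the angle-addition identities at work.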

To see how this works in practice, consider a rotation such as the one shown
in Figure 4.7, but through a larger rotation angle of $\alpha = 150^\circ$. If the original
vector is given by $\vec{A} = A_x \hat{\imath} + A_y \hat{\jmath} = 5\hat{\imath} + 3\hat{\jmath}$, then

\[ \begin{pmatrix} A'_x \\ A'_y \end{pmatrix} = \begin{pmatrix} \cos(150^\circ) & -\sin(150^\circ) \\ \sin(150^\circ) & \cos(150^\circ) \end{pmatrix} \begin{pmatrix} 5 \\ 3 \end{pmatrix} = \begin{pmatrix} -5.83 \\ -0.10 \end{pmatrix}, \tag{4.16} \]

so the new vector is $\vec{A}' = -5.83\hat{\imath} - 0.10\hat{\jmath}$. This means that by rotating vector
A through 150°, you’ve produced a new vector that lies almost entirely along
the negative x-axis (you can see this by noting that the x-component is
negative and much larger in magnitude than the y-component). Remember that this
is a new vector expressed using the same basis ($\hat{\imath}$ and $\hat{\jmath}$) and is not the same
vector expressed using a new basis (because in this case you rotated the vector,
not the coordinate system).
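To make the distinction concrete, the sketch below applies both kinds of 150° transformation to the same starting components (5, 3). It assumes the two matrix forms given in the text: one for rotating the vector and one for rotating the axes. The function names are hypothetical.

```python
import math

def rotate_the_vector(A, angle_deg):
    """Active: new vector, same basis."""
    t = math.radians(angle_deg)
    return (A[0] * math.cos(t) - A[1] * math.sin(t),
            A[0] * math.sin(t) + A[1] * math.cos(t))

def rotate_the_axes(A, angle_deg):
    """Passive: same vector, components referred to rotated axes."""
    t = math.radians(angle_deg)
    return (A[0] * math.cos(t) + A[1] * math.sin(t),
            -A[0] * math.sin(t) + A[1] * math.cos(t))

A = (5.0, 3.0)
active = rotate_the_vector(A, 150.0)
passive = rotate_the_axes(A, 150.0)

print("rotated vector, original axes:", [round(c, 2) for c in active])
print("original vector, rotated axes:", [round(c, 2) for c in passive])
```

The two results differ, which is the point of the warning above: the first pair of numbers describes a different vector, while the second pair describes the original vector in a different coordinate system.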
Figure 4.9 Components of $\hat{\imath}'$ and $\hat{\jmath}'$ in original (unrotated) coordinate system.


You can, of course, rotate the basis vectors $\hat{\imath}$ and $\hat{\jmath}$ using this same approach.
This can be helpful if you’re faced with a problem involving a rotated
coordinate system and you wish to express the basis vectors pointing along the axes
of the rotated system in terms of the basis vectors in the original (non-rotated)
system. For example, to rotate the $\hat{\imath}$ unit vector by 150° counter-clockwise, you
can use

\[ \begin{pmatrix} i'_x \\ i'_y \end{pmatrix} = \begin{pmatrix} \cos(150^\circ) & -\sin(150^\circ) \\ \sin(150^\circ) & \cos(150^\circ) \end{pmatrix} \begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} -0.866 \\ 0.5 \end{pmatrix}, \tag{4.17} \]

where $i'_x$ represents the x-component of the 150°-rotated $\hat{\imath}$ vector and $i'_y$
represents the y-component of the rotated $\hat{\imath}$ vector, as shown in Figure 4.9(a). You
can also rotate the $\hat{\jmath}$ unit vector by the same angle using

\[ \begin{pmatrix} j'_x \\ j'_y \end{pmatrix} = \begin{pmatrix} \cos(150^\circ) & -\sin(150^\circ) \\ \sin(150^\circ) & \cos(150^\circ) \end{pmatrix} \begin{pmatrix} 0 \\ 1 \end{pmatrix} = \begin{pmatrix} -0.5 \\ -0.866 \end{pmatrix}, \tag{4.18} \]

where $j'_x$ represents the x-component of the 150°-rotated $\hat{\jmath}$ vector and $j'_y$
represents the y-component of the rotated $\hat{\jmath}$ vector, as shown in Figure 4.9(b).

Just as in Eq. 4.15, the new components of the $\hat{\imath}'$ and $\hat{\jmath}'$ vectors are expressed
in the same coordinate system as the original $\hat{\imath}$ and $\hat{\jmath}$. As pointed out in the
previous section, the components of $\hat{\imath}'$ and $\hat{\jmath}'$ in the rotated coordinate system
must be (1, 0) and (0, 1).
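The rotated basis vectors and the transformed components fit together: multiplying each new component by the corresponding new basis vector and adding should give back the original vector. The sketch below checks this for the 150° example. It assumes that, because the rotated basis is orthonormal, the new components can be obtained as dot products with the new basis vectors; the function names are hypothetical.

```python
import math

def rotate(v, angle_deg):
    """Rotate a 2-D vector counter-clockwise by angle_deg."""
    t = math.radians(angle_deg)
    return (v[0] * math.cos(t) - v[1] * math.sin(t),
            v[0] * math.sin(t) + v[1] * math.cos(t))

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1]

# New basis vectors, expressed in the original coordinate system.
i_new = rotate((1.0, 0.0), 150.0)
j_new = rotate((0.0, 1.0), 150.0)
print("i' =", [round(c, 3) for c in i_new])
print("j' =", [round(c, 3) for c in j_new])

# The new basis is still orthonormal.
print(dot(i_new, i_new), dot(j_new, j_new), dot(i_new, j_new))

# Components of A = (5, 3) along the new axes (valid for orthonormal bases).
A = (5.0, 3.0)
Ax_new, Ay_new = dot(A, i_new), dot(A, j_new)
print("A'_x, A'_y =", round(Ax_new, 2), round(Ay_new, 2))

# Rebuild A from the new components and the new basis vectors.
rebuilt = (Ax_new * i_new[0] + Ay_new * j_new[0],
           Ax_new * i_new[1] + Ay_new * j_new[1])
print("rebuilt A =", rebuilt)
```

The rebuilt vector matches (5, 3), so changing both the basis vectors and the components leaves the vector itself unchanged.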

So if you wish to transform a set of basis vectors into new basis
vectors (pointing along different coordinate axes), you use a “direct” or “active”
transformation matrix, and the matrix equation looks like this:

\[ \begin{pmatrix} \text{New basis} \\ \text{vectors} \end{pmatrix} = \begin{pmatrix} \text{Direct} \\ \text{transformation} \\ \text{matrix} \end{pmatrix} \begin{pmatrix} \text{Original basis} \\ \text{vectors} \end{pmatrix}. \tag{4.19} \]


Comparing this to Eq. 4.14 should help you understand that transforma- 
tion matrices can be used for two different but related operations: finding the 
components of the same vector in a new coordinate system or finding the 
components of a different vector (such as a new basis vector) in the original 
coordinate system. The next section presents a comparison of these two types 
of transformation matrix. 


4.3 Basis-vector vs. component transformations

Since Eq. 4.14 and Eq. 4.19 both involve transformation matrices, it’s natural 
to wonder how those transformation matrices might be related. You can find a 
clue to that relationship by comparing the transformation matrix in Eq. 4.7 
(pertaining to component change due to a coordinate-axis rotation through
angle $\theta$) with that of Eq. 4.15 (pertaining to basis-vector rotation through angle
$\theta$). Extracting the transformation matrix from each of those equations gives:

From Eq. 4.7 (transformation matrix for finding components of the same vector
as the coordinate system is rotated through angle $\theta$):

\[ \begin{pmatrix} \cos(\theta) & \sin(\theta) \\ -\sin(\theta) & \cos(\theta) \end{pmatrix} \]

From Eq. 4.15 (transformation matrix for finding new basis vectors by rotating
the original basis vectors through angle $\theta$):

\[ \begin{pmatrix} \cos(\theta) & -\sin(\theta) \\ \sin(\theta) & \cos(\theta) \end{pmatrix} \]

Multiplying these two matrices reveals the nature of the relationship
between them:

\[ \begin{pmatrix} \cos(\theta) & \sin(\theta) \\ -\sin(\theta) & \cos(\theta) \end{pmatrix} \begin{pmatrix} \cos(\theta) & -\sin(\theta) \\ \sin(\theta) & \cos(\theta) \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}. \]

This means that in this case the component-transformation matrix is the inverse 
of the basis-vector transformation matrix (since multiplying a matrix by its 
inverse produces the identity matrix). The fact that in this case the transpose of 
the transformation matrix is equal to its inverse means that this transformation 
matrix is “orthogonal” (converting from one Cartesian coordinate system into 
a different one). 
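Both statements (the product is the identity, and the transpose equals the inverse) are easy to confirm numerically. The following is a minimal sketch for a single assumed angle of 150°, using hypothetical helper names.

```python
import math

def passive_matrix(theta_deg):
    """Component-transformation matrix for axes rotated by theta."""
    t = math.radians(theta_deg)
    return [[math.cos(t), math.sin(t)],
            [-math.sin(t), math.cos(t)]]

def active_matrix(theta_deg):
    """Basis-vector (rotation) matrix for the same angle."""
    t = math.radians(theta_deg)
    return [[math.cos(t), -math.sin(t)],
            [math.sin(t), math.cos(t)]]

def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[r][k] * b[k][c] for k in range(2)) for c in range(2)]
            for r in range(2)]

def transpose(m):
    return [[m[c][r] for c in range(2)] for r in range(2)]

def close(a, b, tol=1e-12):
    """Element-by-element comparison of two 2x2 matrices."""
    return all(abs(a[r][c] - b[r][c]) < tol
               for r in range(2) for c in range(2))

IDENTITY = [[1.0, 0.0], [0.0, 1.0]]
theta = 150.0

product = matmul(passive_matrix(theta), active_matrix(theta))
print("product is identity:", close(product, IDENTITY))
print("transpose of active equals passive:",
      close(transpose(active_matrix(theta)), passive_matrix(theta)))
```

Since the passive matrix is both the inverse and the transpose of the active matrix, the transpose and the inverse coincide, which is the defining property of an orthogonal matrix.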

In light of the inverse relationship between the basis-vector transformation 
matrix and the vector-component transformation matrix, you might say that in 
this case the vector components transform inversely to or “against” the man- 
ner in which the basis vectors transform (provided that you remember that by 
“components transform” you mean finding the components of the same vec- 
tor in the new coordinate system, and by “basis vectors transform” you mean 
rotating the basis vectors to point along different coordinate axes). 

You should also remember that rotation of Cartesian coordinate axes is only 
one among many possible forms of transformation. In general, any time you 
choose to switch from one set of basis vectors to another, you must consider 
the effect of your choice of new basis vectors on the components of the vectors 
in your system. How the matrix that transforms the original basis vectors into 
the new ones relates to the matrix that converts the vector components depends 
on the type of component you’re using to represent the vector. 

If you’re surprised to learn that there can be more than one type of compo- 
nent for a given vector, you should consider a coordinate system in which 
the axes are not perpendicular to one another. You can learn about such 
“non-orthogonal” coordinate systems in the next section. 


4.4 Non-orthogonal coordinate systems 

In Cartesian coordinate systems, there’s no chance for ambiguity when you
consider the process of “projection” of a vector onto a coordinate axis. Using
the light source and shadow approach described in Chapter 1, you simply
imagine a source of light shining on the vector and the shadow produced by that
vector on one of the coordinate axes, as in Figure 1.6. In two-dimensional
Cartesian coordinates, the direction of the light may be specified in one of two
equivalent ways: parallel to one of the axes (actually antiparallel since the light
shines back toward the origin), or perpendicular to the other axis. For example,
in Figure 1.6(a), you’re saying exactly the same thing if you describe the light
as shining “antiparallel to the y-axis” or “perpendicular to the x-axis.”

Figure 4.10 Projections using light sources parallel to x- and y-axes.

Now imagine a two-dimensional coordinate system in which the x- and
y-axes are not perpendicular to one another.⁴ In such cases, the process of
projecting a vector onto one of the coordinate axes takes on an additional
complication. Should the light sources shine (anti-) parallel to the coordinate axes,
as in Figure 4.10, or perpendicular to the axes, as in Figure 4.11?

In each case, a “projection” of the vector is formed onto one of the
coordinate axes, but those projections may have quite different lengths, as you can
see by comparing the lengths of the “shadows” cast in Figure 4.10 to those in
Figure 4.11.

You may certainly be forgiven for thinking “So what?” when confronted
with these differing projections. Does it really matter that there are two ways
to project a vector onto an axis in non-orthogonal coordinate systems?

One indication that the type of projection does matter comes about if you
attempt to form vector A from the projection components using the rules of
vector addition. As you can see in Figure 4.12, that
process works perfectly if you use the parallel-projection components but fails
miserably when you attempt to use the perpendicular-projection components.
This may cause you to wonder why the perpendicular-projection components
are called “components” at all.

4 This is not just an academic exercise; non-orthogonal coordinate axes turn up quite naturally in
problems in relativity, fluid dynamics, and other areas.

Figure 4.11 Projections using light sources perpendicular to x- and y-axes.

Figure 4.12 Vector addition of components formed by parallel and
perpendicular projection.
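The difference between the two kinds of projection can be seen with numbers. The sketch below assumes unit basis vectors along axes that are 60° apart and an arbitrary vector A = (3, 2), both chosen only for illustration; the function names are hypothetical. Parallel projections are found by solving A = a1 e1 + a2 e2, and perpendicular projections are found as dot products with the unit basis vectors.

```python
import math

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1]

def parallel_components(A, e1, e2):
    """Solve A = a1*e1 + a2*e2 for (a1, a2) using Cramer's rule."""
    det = e1[0] * e2[1] - e1[1] * e2[0]
    a1 = (A[0] * e2[1] - A[1] * e2[0]) / det
    a2 = (e1[0] * A[1] - e1[1] * A[0]) / det
    return a1, a2

def perpendicular_components(A, e1, e2):
    """Shadows cast by light perpendicular to each axis (unit basis vectors)."""
    return dot(A, e1), dot(A, e2)

def recombine(c, e1, e2):
    """Vector sum c[0]*e1 + c[1]*e2."""
    return (c[0] * e1[0] + c[1] * e2[0], c[0] * e1[1] + c[1] * e2[1])

# Assumed non-orthogonal unit basis: axes separated by 60 degrees.
e1 = (1.0, 0.0)
e2 = (math.cos(math.radians(60.0)), math.sin(math.radians(60.0)))
A = (3.0, 2.0)

par = parallel_components(A, e1, e2)
perp = perpendicular_components(A, e1, e2)

print("parallel projections:     ", par, "->", recombine(par, e1, e2))
print("perpendicular projections:", perp, "->", recombine(perp, e1, e2))
```

Only the parallel-projection components add up to A when multiplied by e1 and e2. If the second axis is changed to (0, 1), the two kinds of projection give the same numbers, as expected for orthogonal axes.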

Another way to appreciate the significance of the difference between paral- 
lel and perpendicular projections is to consider how the components formed by 
these two types of projection transform between coordinate systems. As you’ll 
see later in this chapter, the components formed by projections perpendicular 
to the coordinate axes transform between coordinate systems using the direct 
transformation matrix that is also used to form the new basis vectors in the 
new coordinate system, while the components formed by projections parallel 
to the coordinate axes transform between coordinate systems using the inverse
transformation matrix. This behavior has caused the perpendicular-projection 
components to traditionally be called the “covariant” components of the vec- 
tor, while the parallel-projection components are called the “contravariant” 
components of the vector. Of course, for orthogonal coordinate systems, the 
direction parallel to one of the coordinate axes is exactly the same as the direc- 
tion perpendicular to other axes, so in that case the covariant and contravariant 
components of a vector are identical, and no distinction is needed. 

To learn why the covariant values are called “components,” and, much more 
importantly, to understand why covariant and contravariant components are 
meaningful quantities and how they may be used to write physical laws that do 
not depend on the reference frame of the observer, you should first understand 
the concept of dual basis vectors. You can read about such basis vectors in the 
next section. 


4.5 Dual basis vectors 

For non-orthogonal coordinate systems, it’s clear from geometric
considerations such as those illustrated in Figure 4.12 that the perpendicular projections
of a vector onto the coordinate axes do not form “components” in the way
that parallel projections do; the perpendicular projections simply don’t add up
as vectors to give the original vector. But to truly understand the process of
“adding up” components as vectors, you have to think about the role of the
basis vectors in that addition. To see how that works for parallel projections,
take a look at the basis vectors $\vec{e}_1$ and $\vec{e}_2$ pointing along the (non-orthogonal)
coordinate axes in Figure 4.13 and the projections of vector A onto those
directions. In this case, vector A may be written as

\[ \vec{A} = A^x \vec{e}_1 + A^y \vec{e}_2, \tag{4.20} \]

where $A^x$ and $A^y$ represent the parallel-projection (contravariant) components
of A.⁵

The same approach doesn’t work for the perpendicular-projection
(covariant) components $A_x$ and $A_y$, as you can tell by looking at the lengths of the
projections in Figure 4.12(b); it’s clear that those two “components” multiplied
by the basis vectors $\vec{e}_1$ and $\vec{e}_2$ do not add up to give A. So it’s
reasonable to wonder if there are alternative basis vectors that would allow the
perpendicular-projection components to form a vector in a manner analogous
to Eq. 4.20. Happily, there are, and those alternative basis vectors are called
“reciprocal” or “dual” basis vectors.

5 The use of superscripts for the “x” and “y” in the contravariant components $A^x$ and $A^y$ is
deliberate and is the standard notation for distinguishing these contravariant components from
the covariant components $A_x$ and $A_y$.

Figure 4.13 Parallel-projection components and basis vectors.

Dual basis vectors have two defining characteristics. The first is that each 
one must be perpendicular to all original basis vectors with different indices. 
So if you call the dual basis vectors $\vec{e}^1$ and $\vec{e}^2$ to distinguish them from the
original basis vectors $\vec{e}_1$ and $\vec{e}_2$, you can be sure that $\vec{e}^1$ is perpendicular to
$\vec{e}_2$ (and thus perpendicular to the y-axis in this case). Likewise, $\vec{e}^2$ must be
perpendicular to $\vec{e}_1$ (and thus perpendicular to the x-axis in this case). The
directions of the dual basis vectors $\vec{e}^1$ and $\vec{e}^2$ are shown in Figure 4.14.

The second defining characteristic for dual basis vectors is that the dot product
between each dual basis vector and the original basis vector with the same
index must equal one (so $\vec{e}^1 \cdot \vec{e}_1 = 1$ and $\vec{e}^2 \cdot \vec{e}_2 = 1$). This means that you
can find the lengths of the dual basis vectors as long as you know the lengths of
the original basis vectors and the angle between each dual basis vector and the
corresponding original basis vector.⁶ So to find the length of $\vec{e}^1$, you simply
have to multiply the length of the original basis vector $\vec{e}_1$ by the cosine of the
angle between $\vec{e}^1$ and $\vec{e}_1$ and then take the inverse of the result. Likewise, to
find the length of $\vec{e}^2$, multiply the length of the original basis vector $\vec{e}_2$ by the
cosine of the angle between $\vec{e}^2$ and $\vec{e}_2$ and take the inverse of that result. Thus:


$$|\vec{e}^1| = \frac{1}{|\vec{e}_1| \cos(\theta_1)} \tag{4.21}$$

and

$$|\vec{e}^2| = \frac{1}{|\vec{e}_2| \cos(\theta_2)}, \tag{4.22}$$

where $\theta_1$ is the angle between $\vec{e}^1$ and $\vec{e}_1$ and $\theta_2$ is the angle between $\vec{e}^2$ and $\vec{e}_2$.

6 Recall from Chapter 2 that $\vec{A} \cdot \vec{B} = |\vec{A}||\vec{B}| \cos(\theta)$, where $\theta$ is the angle between $\vec{A}$ and $\vec{B}$.

Figure 4.14 Perpendicular-projection components and dual basis vectors.

With the concept of dual basis vectors in hand, you’re in a position to understand
why the perpendicular-projection (covariant) components $A_x$ and $A_y$
may rightfully be called “components.” The key is that the projections must
be made onto the direction of the dual basis vectors rather than onto the directions
of the original basis vectors. If you do that, then the covariant components
$A_x$ and $A_y$ can be multiplied by the relevant basis vectors and added to give the
original vector $\vec{A}$ in the same way as can be done using the parallel-projection
(contravariant) components $A^x$ and $A^y$. The covariant-component equivalent
to Eq. 4.20 is thus

$$\vec{A} = A_x \vec{e}^1 + A_y \vec{e}^2. \tag{4.23}$$
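
One way to check that the coefficients in Eq. 4.23 behave as described is to take the dot product of both sides with an original basis vector. The short worked step below is a sketch that assumes only the two defining characteristics of dual basis vectors stated above (perpendicularity for different indices, unit dot product for the same index):

```latex
\begin{aligned}
\vec{A} \cdot \vec{e}_1
  &= \left( A_x \vec{e}^1 + A_y \vec{e}^2 \right) \cdot \vec{e}_1 \\
  &= A_x \left( \vec{e}^1 \cdot \vec{e}_1 \right) + A_y \left( \vec{e}^2 \cdot \vec{e}_1 \right) \\
  &= A_x (1) + A_y (0) = A_x .
\end{aligned}
```

The same step with $\vec{e}_2$ isolates $A_y$. Since $\vec{A} \cdot \vec{e}_1$ equals $|\vec{e}_1|$ times the perpendicular projection of $\vec{A}$ onto the direction of $\vec{e}_1$, this ties the covariant components back to the perpendicular projections of Figure 4.12(b).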


As you may have guessed, the use of superscripts to denote the dual basis
vectors $\vec{e}^1$ and $\vec{e}^2$ is not accidental; when these basis vectors are transformed
to a new coordinate system, the inverse transformation matrix is used, as it is
for the contravariant vector components $A^x$ and $A^y$.

Note that in a two-dimensional coordinate system with orthonormal basis
vectors such as $\hat{\imath}$ and $\hat{\jmath}$, the dual basis vectors are identical to the original basis
vectors along the coordinate axes. That’s easily understood, because the direction
of each of the dual basis vectors must be perpendicular to the direction
of one of the original basis vectors (and hence must point along the x- and
y-axes). And since the length of the dual basis vectors must equal the inverse
of the length of the original basis vectors times $\cos(\theta)$ (which is $1/[(1)\cos(0^\circ)] = 1$
in this case), the dual basis vectors have the same length as well as the same
direction as $\hat{\imath}$ and $\hat{\jmath}$. So the differences between original and dual basis vectors
disappear for orthonormal coordinate systems, just as the distinctions between
covariant and contravariant components disappear for such systems.

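The concept of dual basis vectors can be readily extended to three dimensions,
and in that case determination of the length and direction of the dual
basis vectors is most easily done using the dot and cross product between vectors.
Specifically, the three-dimensional dual basis vectors $\vec{e}^1$, $\vec{e}^2$, and $\vec{e}^3$ can
be found from the original basis vectors $\vec{e}_1$, $\vec{e}_2$, and $\vec{e}_3$ using the following
relations:

$$
\vec{e}^1 = \frac{\vec{e}_2 \times \vec{e}_3}{\vec{e}_1 \cdot (\vec{e}_2 \times \vec{e}_3)}, \qquad
\vec{e}^2 = \frac{\vec{e}_3 \times \vec{e}_1}{\vec{e}_1 \cdot (\vec{e}_2 \times \vec{e}_3)}, \qquad
\vec{e}^3 = \frac{\vec{e}_1 \times \vec{e}_2}{\vec{e}_1 \cdot (\vec{e}_2 \times \vec{e}_3)}. \tag{4.24}
$$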


Each denominator is the triple scalar product of the original basis vectors, 
which you may recall from Section 2.3 is the volume of the parallelepiped 
formed by those vectors. 

In these equations, the cross products in the numerators ensure that the first
characteristic of dual basis vectors is met (for example, that $\vec{e}^1$ is perpendicular
to $\vec{e}_2$ and to $\vec{e}_3$). The triple scalar products in the denominators ensure that the
second characteristic is met (for example, that $\vec{e}^1 \cdot \vec{e}_1 = 1$).
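
The relations in Eq. 4.24 translate directly into a few lines of code. What follows is a minimal sketch in plain Python; the helper names (`dot`, `cross`, `dual_basis_3d`) and the sample basis vectors are illustrative assumptions rather than anything taken from the text.

```python
def dot(u, v):
    """Dot product of two 3-component vectors."""
    return sum(a * b for a, b in zip(u, v))


def cross(u, v):
    """Cross product u x v of two 3-component vectors."""
    return (
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    )


def dual_basis_3d(e1, e2, e3):
    """Return the dual basis (e^1, e^2, e^3) following Eq. 4.24."""
    volume = dot(e1, cross(e2, e3))  # triple scalar product (shared denominator)
    if volume == 0:
        raise ValueError("basis vectors are coplanar; no dual basis exists")
    return tuple(
        tuple(c / volume for c in numerator)
        for numerator in (cross(e2, e3), cross(e3, e1), cross(e1, e2))
    )


# Hypothetical non-orthogonal basis, chosen only for illustration.
originals = ((1.0, 3.0, 0.0), (4.0, 0.0, 0.0), (0.0, 1.0, 2.0))
duals = dual_basis_3d(*originals)

# Both defining characteristics at once: e^i . e_j is 1 when i == j, else 0.
for dual in duals:
    print([round(dot(dual, original), 12) for original in originals])
```

Each printed row corresponds to one dual basis vector dotted with the three original basis vectors, so the rows should form the identity matrix up to rounding.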

The computation of dual basis vectors may seem like a long trek to make 
simply to have an alternative way of writing vectors, but there’s a great truth to 
be found by comparing Eqs. 4.20 and 4.23. Since these equations describe the 
same vector, you may combine them to write 


$$\vec{A} = A^x \vec{e}_1 + A^y \vec{e}_2 = A_x \vec{e}^1 + A_y \vec{e}^2, \tag{4.25}$$


which serves to emphasize an important fact. If you seek to define a quantity 
(such as vector A) that remains invariant under a transformation of coordinates, 
you have a choice: you can combine superscripted (contravariant) components 
with subscripted (covariant) basis vectors, or you can combine subscripted 
(covariant) components with superscripted (contravariant) basis vectors. That 
should seem reasonable to you, because covariant quantities transform using 
a direct transformation matrix, while contravariant quantities use an inverse
transformation matrix. Multiplying such quantities guarantees that the result is
unaffected by the transformation.

You can see an example of how dual basis vectors and covariant and 
contravariant components are determined in the next section. 


4.6 Finding covariant and contravariant components 

Once you grasp the concept of dual basis vectors in non-orthonormal coordi- 
nate systems, finding the covariant and contravariant components of a vector 
is straightforward. As an example, take a look at vector A in Figure 4.15, with 
non-orthogonal basis vectors $\vec{e}_1$ and $\vec{e}_2$.

Figure 4.15 Non-orthogonal basis vectors.

Figure 4.16 Parallel projections onto original basis vectors.

Finding the contravariant components $A^1$ and $A^2$ is simply a matter of
parallel-projecting vector $\vec{A}$ onto the directions of the original basis vectors
$\vec{e}_1$ and $\vec{e}_2$, as shown in Figure 4.16. A quick visual inspection suggests that
component $A^1|\vec{e}_1|$ should be about 2/3 the length of original basis vector $\vec{e}_1$
and component $A^2|\vec{e}_2|$ should be about 1.5 times the length of original basis
vector $\vec{e}_2$. The values of $A^1$ and $A^2$ can be found by writing the vector equation

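$$\vec{A} = A^1 \vec{e}_1 + A^2 \vec{e}_2, \tag{4.26}$$

which can be written as two equations for the components:

$$
\begin{aligned}
A_x &= A^1 e_{1,x} + A^2 e_{2,x}, \\
A_y &= A^1 e_{1,y} + A^2 e_{2,y}.
\end{aligned}
$$

These two simultaneous equations may readily be solved for $A^1$ and $A^2$ using
the elimination or substitution method (both of which are demonstrated in the
on-line solutions to the problems at the end of this chapter). Another approach
is the matrix method and Cramer’s Rule (described in the matrix-algebra
review on the book’s website). Using this approach, you begin by substituting
the known values for vector $\vec{A}$ as well as $\vec{e}_1$ and $\vec{e}_2$:

$$\begin{pmatrix} 7 \\ 2 \end{pmatrix} = A^1 \begin{pmatrix} 1 \\ 3 \end{pmatrix} + A^2 \begin{pmatrix} 4 \\ 0 \end{pmatrix}, \tag{4.27}$$

which may also be written as

$$\begin{pmatrix} 7 \\ 2 \end{pmatrix} = \begin{pmatrix} 1 & 4 \\ 3 & 0 \end{pmatrix} \begin{pmatrix} A^1 \\ A^2 \end{pmatrix}. \tag{4.28}$$

Now use Cramer’s Rule to find $A^1$ and $A^2$:

$$
A^1 = \frac{\begin{vmatrix} 7 & 4 \\ 2 & 0 \end{vmatrix}}{\begin{vmatrix} 1 & 4 \\ 3 & 0 \end{vmatrix}} = \frac{-8}{-12} = 0.667, \qquad
A^2 = \frac{\begin{vmatrix} 1 & 7 \\ 3 & 2 \end{vmatrix}}{\begin{vmatrix} 1 & 4 \\ 3 & 0 \end{vmatrix}} = \frac{-19}{-12} = 1.58. \tag{4.29}
$$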


These values are consistent with the visual estimates from Figure 4.16. 
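
The Cramer’s Rule calculation can be checked with a short script. This is a minimal sketch in plain Python using the example values from the text ($\vec{A} = (7, 2)$, $\vec{e}_1 = (1, 3)$, $\vec{e}_2 = (4, 0)$); the function names `det2` and `solve_cramer_2x2` are made up for illustration.

```python
def det2(a, b, c, d):
    """Determinant of the 2x2 matrix [[a, b], [c, d]]."""
    return a * d - b * c


def solve_cramer_2x2(col1, col2, rhs):
    """Solve x1*col1 + x2*col2 = rhs for (x1, x2) using Cramer's Rule."""
    denom = det2(col1[0], col2[0], col1[1], col2[1])
    if denom == 0:
        raise ValueError("basis vectors are parallel; no unique solution")
    # Replace one column at a time with the right-hand side.
    x1 = det2(rhs[0], col2[0], rhs[1], col2[1]) / denom
    x2 = det2(col1[0], rhs[0], col1[1], rhs[1]) / denom
    return x1, x2


A = (7.0, 2.0)    # Cartesian components of vector A
e1 = (1.0, 3.0)   # original basis vector e_1
e2 = (4.0, 0.0)   # original basis vector e_2

A_up_1, A_up_2 = solve_cramer_2x2(e1, e2, A)  # contravariant components
print(round(A_up_1, 3), round(A_up_2, 3))
```

The basis vectors enter as the columns of the matrix in Eq. 4.28, which is why `solve_cramer_2x2` takes columns rather than rows.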

To use the same process to find the perpendicular-projection (covariant)
components $A_1$ and $A_2$, you must first determine the length and direction of
the dual basis vectors. You know that the direction of $\vec{e}^1$ must be perpendicular
to that of $\vec{e}_2$, and the direction of $\vec{e}^2$ must be perpendicular to that of $\vec{e}_1$. As for
the lengths, first find the lengths of $\vec{e}_1$ and $\vec{e}_2$:

$$|\vec{e}_1| = \sqrt{(1)^2 + (3)^2} = 3.16, \qquad |\vec{e}_2| = \sqrt{(4)^2 + (0)^2} = 4.00. \tag{4.30}$$

Then you can use Eqs. 4.21 and 4.22 to find $|\vec{e}^1|$ and $|\vec{e}^2|$, but first you have
to figure out the angle between $\vec{e}_1$ and $\vec{e}^1$ (which is $\theta_1$) and the angle between
$\vec{e}_2$ and $\vec{e}^2$ (which is $\theta_2$). If you look at Figure 4.17, you should be able to
determine that $\theta_1 = \theta_2 = \arctan(1/3) = 18.43^\circ$, so you have

$$
\begin{aligned}
|\vec{e}^1| &= \frac{1}{|\vec{e}_1| \cos(\theta_1)} = \frac{1}{3.16 \cos(18.43^\circ)} = 0.333, \\
|\vec{e}^2| &= \frac{1}{|\vec{e}_2| \cos(\theta_2)} = \frac{1}{4.00 \cos(18.43^\circ)} = 0.264.
\end{aligned} \tag{4.31}
$$

Figure 4.17 Perpendicular projections onto dual basis vectors.


You can see the (very short) dual basis vectors $\vec{e}^1$ and $\vec{e}^2$ in Figure 4.17.
Note that $\vec{e}^1$ is perpendicular to $\vec{e}_2$ and that $\vec{e}^2$ is perpendicular to $\vec{e}_1$, and their
lengths are given by Eq. 4.31.
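
The two defining characteristics also give a direct recipe for building the dual basis vectors in two dimensions: rotate the "other" original basis vector by 90° to get the direction, then rescale so the dot product with the matching original basis vector equals one. The sketch below applies that recipe to the example basis; the function name `dual_basis_2d` is an illustrative assumption, and this construction is an alternative to the length-and-angle route of Eqs. 4.21 and 4.22 rather than the text's own procedure.

```python
import math


def dual_basis_2d(e1, e2):
    """Return (e^1, e^2) for the two-dimensional basis (e_1, e_2)."""

    def perp(v):
        # Rotate v by 90 degrees, so that perp(v) . v == 0.
        return (v[1], -v[0])

    def dot(u, v):
        return u[0] * v[0] + u[1] * v[1]

    p1 = perp(e2)      # direction of e^1: perpendicular to e_2
    p2 = perp(e1)      # direction of e^2: perpendicular to e_1
    s1 = dot(p1, e1)   # rescaling so that e^1 . e_1 == 1
    s2 = dot(p2, e2)   # rescaling so that e^2 . e_2 == 1
    if s1 == 0 or s2 == 0:
        raise ValueError("basis vectors are parallel; no dual basis exists")
    return (p1[0] / s1, p1[1] / s1), (p2[0] / s2, p2[1] / s2)


e1, e2 = (1.0, 3.0), (4.0, 0.0)   # example basis from the text
d1, d2 = dual_basis_2d(e1, e2)

len_d1 = math.hypot(*d1)   # compare with 0.333 in Eq. 4.31
len_d2 = math.hypot(*d2)   # compare with 0.264 in Eq. 4.31
print(d1, d2, round(len_d1, 3), round(len_d2, 3))
```

Dividing by the signed dot product also fixes the sense of each dual basis vector, so no separate decision about which of the two perpendicular directions to use is needed.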

Once you have the dual basis vectors in hand, you’re in a position to find
the perpendicular-projection (covariant) components $A_1$ and $A_2$. You can do
this geometrically by continuing the perpendicular-projection lines beyond the
direction lines of $\vec{e}_1$ and $\vec{e}_2$ and onto the direction lines of $\vec{e}^1$ and $\vec{e}^2$, as shown
in Figure 4.17. The magnitude of vector $\vec{A}$ is

$$|\vec{A}| = \sqrt{(7)^2 + (2)^2} = 7.28, \tag{4.32}$$

and the angle between $\vec{A}$ and the x-axis is $\arctan(2/7) = 15.94^\circ$. Using this
value and $\theta_1$ from above, you can determine that the angle between $\vec{A}$ and $\vec{e}_1$
is 55.62° and the angle between $\vec{A}$ and $\vec{e}_2$ is 15.94°. So the length of $\ell_1$ in
Figure 4.17(a) is

$$\ell_1 = |\vec{A}| \cos(55.62^\circ) = 4.11, \tag{4.33}$$

and

$$A_1 |\vec{e}^1| = \frac{\ell_1}{\cos(18.43^\circ)} = 4.33, \tag{4.34}$$

so $A_1 = 4.33/0.333 = 13.0$.

Using the same approach to find $A_2$ from Figure 4.17(b) gives

$$\ell_2 = |\vec{A}| \cos(15.94^\circ) = 7.00, \tag{4.35}$$

and

$$A_2 |\vec{e}^2| = \frac{\ell_2}{\cos(18.43^\circ)} = 7.38, \tag{4.36}$$

so $A_2 = 7.38/0.264 = 28.0$.
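
The arithmetic of this geometric approach can be retraced numerically. The sketch below follows Eqs. 4.30 through 4.36 step by step in plain Python; the variable names are illustrative, and the angle $\theta_1 = \theta_2 = \arctan(1/3)$ is taken as given from the text's reading of Figure 4.17 rather than derived in the code.

```python
import math

A, e1, e2 = (7.0, 2.0), (1.0, 3.0), (4.0, 0.0)   # example values from the text


def angle_from_x_axis(v):
    """Angle of a 2-D vector measured counterclockwise from the x-axis, in radians."""
    return math.atan2(v[1], v[0])


# Angle between each dual basis vector and its partner (assumed from Figure 4.17).
theta = math.atan(1.0 / 3.0)

mag_A = math.hypot(*A)                                    # Eq. 4.32
len_dual_1 = 1.0 / (math.hypot(*e1) * math.cos(theta))    # Eq. 4.21
len_dual_2 = 1.0 / (math.hypot(*e2) * math.cos(theta))    # Eq. 4.22

angle_A_e1 = angle_from_x_axis(e1) - angle_from_x_axis(A)
angle_A_e2 = angle_from_x_axis(A) - angle_from_x_axis(e2)

ell_1 = mag_A * math.cos(angle_A_e1)   # Eq. 4.33
ell_2 = mag_A * math.cos(angle_A_e2)   # Eq. 4.35

A_1 = (ell_1 / math.cos(theta)) / len_dual_1   # Eq. 4.34, then divide by |e^1|
A_2 = (ell_2 / math.cos(theta)) / len_dual_2   # Eq. 4.36, then divide by |e^2|
print(round(math.degrees(angle_A_e1), 2), round(A_1, 3), round(A_2, 3))
```

Working with unrounded intermediate values, as this sketch does, avoids the small discrepancies that appear when the three-digit figures are chained by hand.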

These results serve as a reminder that when you use non-normalized basis 
vectors (that is, basis vectors with magnitude not equal to one), you cannot 
equate the lengths of the projections onto the coordinate axes with the value 
of a vector’s components. That’s because those projections are the products of 
the components with the magnitudes of the basis vectors. 

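If you prefer the algebraic approach to finding $A_1$ and $A_2$, you can do that
by proceeding as you did for $A^1$ and $A^2$, although in this case you begin with

$$\vec{A} = A_1 \vec{e}^1 + A_2 \vec{e}^2, \tag{4.37}$$

and then substitute the known values for vector $\vec{A}$ as well as the x- and y-components
of the dual basis vectors $\vec{e}^1$ and $\vec{e}^2$:

$$
\begin{aligned}
e^1_x &= |\vec{e}^1| \cos(90^\circ) = 0.000, & e^2_x &= |\vec{e}^2| \cos(360^\circ - 18.43^\circ) = 0.250, \\
e^1_y &= |\vec{e}^1| \sin(90^\circ) = 0.333, & e^2_y &= |\vec{e}^2| \sin(360^\circ - 18.43^\circ) = -0.083.
\end{aligned}
$$

So

$$\begin{pmatrix} 7 \\ 2 \end{pmatrix} = A_1 \begin{pmatrix} 0 \\ 0.333 \end{pmatrix} + A_2 \begin{pmatrix} 0.25 \\ -0.083 \end{pmatrix}. \tag{4.38}$$

As before, this may be written as

$$\begin{pmatrix} 7 \\ 2 \end{pmatrix} = \begin{pmatrix} 0 & 0.25 \\ 0.333 & -0.083 \end{pmatrix} \begin{pmatrix} A_1 \\ A_2 \end{pmatrix}. \tag{4.39}$$

Again using Cramer’s Rule to solve for $A_1$ and $A_2$ gives

$$
A_1 = \frac{\begin{vmatrix} 7 & 0.25 \\ 2 & -0.083 \end{vmatrix}}{\begin{vmatrix} 0 & 0.25 \\ 0.333 & -0.083 \end{vmatrix}} = \frac{-1.081}{-0.083} = 13.0, \qquad
A_2 = \frac{\begin{vmatrix} 0 & 7 \\ 0.333 & 2 \end{vmatrix}}{\begin{vmatrix} 0 & 0.25 \\ 0.333 & -0.083 \end{vmatrix}} = \frac{-2.331}{-0.083} = 28.0, \tag{4.40}
$$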
as expected from the geometric approach.

A simpler approach to finding the contravariant and covariant components
of a vector once you have both the original and dual basis vectors in hand is to
use these relations:

$$A_1 = \vec{A} \cdot \vec{e}_1 = A_x e_{1,x} + A_y e_{1,y}, \qquad A_2 = \vec{A} \cdot \vec{e}_2 = A_x e_{2,x} + A_y e_{2,y}, \tag{4.41}$$

and

$$A^1 = \vec{A} \cdot \vec{e}^1 = A_x e^1_x + A_y e^1_y, \qquad A^2 = \vec{A} \cdot \vec{e}^2 = A_x e^2_x + A_y e^2_y. \tag{4.42}$$

In the current example, this approach gives the covariant components as

$$
\begin{aligned}
A_1 &= (7, 2) \cdot (1, 3) = (7)(1) + (2)(3) = 13, \\
A_2 &= (7, 2) \cdot (4, 0) = (7)(4) + (2)(0) = 28,
\end{aligned}
$$

and

$$
\begin{aligned}
A^1 &= (7, 2) \cdot (0, 0.333) = (7)(0) + (2)(0.333) = 0.666, \\
A^2 &= (7, 2) \cdot (0.250, -0.083) = (7)(0.250) + (2)(-0.083) = 1.58,
\end{aligned}
$$

in agreement with the geometric and matrix-algebra approaches taken above.
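
The dot-product relations of Eqs. 4.41 and 4.42, together with the invariance expressed in Eq. 4.25, can be confirmed in a few lines. This is a minimal sketch in plain Python; the helper names `dot` and `combine` are illustrative, and the dual basis vectors are entered as exact fractions, $(0, 1/3)$ and $(1/4, -1/12)$, instead of the rounded values used in the text.

```python
def dot(u, v):
    """Dot product of two vectors given as tuples of Cartesian components."""
    return sum(a * b for a, b in zip(u, v))


def combine(c1, v1, c2, v2):
    """Return the vector c1*v1 + c2*v2."""
    return tuple(c1 * a + c2 * b for a, b in zip(v1, v2))


A = (7.0, 2.0)
e1, e2 = (1.0, 3.0), (4.0, 0.0)                   # original basis vectors
d1, d2 = (0.0, 1.0 / 3.0), (0.25, -1.0 / 12.0)    # dual basis vectors (unrounded)

A_lower = (dot(A, e1), dot(A, e2))   # covariant components, Eq. 4.41
A_upper = (dot(A, d1), dot(A, d2))   # contravariant components, Eq. 4.42

# Eq. 4.25: both pairings should rebuild the same vector A.
from_upper = combine(A_upper[0], e1, A_upper[1], e2)   # A^1 e_1 + A^2 e_2
from_lower = combine(A_lower[0], d1, A_lower[1], d2)   # A_1 e^1 + A_2 e^2
print(A_lower, A_upper, from_upper, from_lower)
```

With unrounded dual basis vectors the first contravariant component comes out as 2/3, which explains why the hand calculation gives 0.666 here and 0.667 from Cramer’s Rule.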

It’s important for you to realize that what you’ve just found are the parallel-projection
(contravariant) and perpendicular-projection (covariant) components
of vector $\vec{A}$ with respect to the original basis vectors $\vec{e}_1$ and $\vec{e}_2$ and the
dual basis vectors $\vec{e}^1$ and $\vec{e}^2$. So does that mean that $\vec{A}$ is a covariant vector or
a contravariant vector? 

The answer is neither (or both, if you prefer); it’s not the vector itself that 
is contravariant or covariant, it’s the set of components that you form through 
its parallel or perpendicular projections. As you read the literature on tensors, 
you’re very likely to run into expressions such as “the contravariant vector 
A” or “the covariant vector B,” and what the author generally means is that
the contravariant components of vector A and the covariant components of 
vector B are being used for the problem (perhaps because they’re simpler). 
But you can be sure that like all vectors, A and B both have contravariant and 
covariant components, and you can find them using the techniques described 
in this section. 

And if you’re wondering why you might want to go through the effort of 
finding those components, rest assured that the payoff is worth the effort. To 
appreciate the value of that payoff, you’ll have to begin thinking of vectors not 
just as arrows with a certain length and pointing in a specified direction, but 
rather as members of a class of objects called tensors that have very predictable
(and useful) properties under transformation of coordinates. In that view, the
vectors you’ve been dealing with up to this point have all been tensors of rank
one. Seeing them as such, and understanding what that means, will be made
a great deal easier through the use of a notation called “index notation” and
a convention known as the “Einstein summation convention.” You can read
about index notation and the summation convention in the next section.

7 In Chapter 5, you can learn to move between contravariant and covariant components using the
metric tensor.


4.7 Index notation 


You’ve seen the first glimmerings of index notation in the earlier section of 
this chapter describing coordinate transformations. As you may recall, the 
angles between the transformed (rotated) coordinate axes and the original 
(non-rotated) axes of a two-dimensional coordinate system were called $\alpha_{11}$,
$\alpha_{12}$, $\alpha_{21}$, and $\alpha_{22}$. These angles could just as well have been designated $\alpha_{x'x}$,
$\alpha_{x'y}$, $\alpha_{y'x}$, and the like, but there are several good reasons to use the index numbers
1, 2, and 3 rather than the letters x, y, and z to refer to coordinate axes
and vector components. One of those reasons is that many problems in physics 
and engineering involve a number of dimensions greater than 3, and although 
everyone agrees that “4” comes after “3,” a consensus hasn’t been reached on 
what comes after “z.” Another reason is that index notation enables the great 
convenience of the summation convention that you can read about later in this 
section. 

Using index notation, the coordinates of a point in three-dimensional
space are written as $(x_1, x_2, x_3)$ or $(x^1, x^2, x^3)$ rather than $(x, y, z)$, and
the components of a vector are written as $(A_1, A_2, A_3)$ or $(A^1, A^2, A^3)$
rather than $(A_x, A_y, A_z)$ or $(A^x, A^y, A^z)$. This system is easily extended to
N-dimensional space, in which the coordinates become $(x_1, x_2, \ldots, x_N)$ or
$(x^1, x^2, \ldots, x^N)$ and the vector components become $(A_1, A_2, \ldots, A_N)$ or
$(A^1, A^2, \ldots, A^N)$.

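Applying this notation to the equation for the transformation of contravariant
vector components produced by a rotation of two-dimensional axes, Eq. 4.6
becomes

$$\begin{pmatrix} A'^1 \\ A'^2 \end{pmatrix} = \begin{pmatrix} \cos(\alpha_{11}) & \cos(\alpha_{12}) \\ \cos(\alpha_{21}) & \cos(\alpha_{22}) \end{pmatrix} \begin{pmatrix} A^1 \\ A^2 \end{pmatrix}. \tag{4.43}$$

In three dimensions, this is

$$\begin{pmatrix} A'^1 \\ A'^2 \\ A'^3 \end{pmatrix} = \begin{pmatrix} \cos(\alpha_{11}) & \cos(\alpha_{12}) & \cos(\alpha_{13}) \\ \cos(\alpha_{21}) & \cos(\alpha_{22}) & \cos(\alpha_{23}) \\ \cos(\alpha_{31}) & \cos(\alpha_{32}) & \cos(\alpha_{33}) \end{pmatrix} \begin{pmatrix} A^1 \\ A^2 \\ A^3 \end{pmatrix}. \tag{4.44}$$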




Designating the elements of the transformation matrix $a_{11}$, $a_{12}$, $a_{13}$, and so
forth allows you to write Eq. 4.44 as

$$
\begin{aligned}
A'^1 &= a_{11} A^1 + a_{12} A^2 + a_{13} A^3, \\
A'^2 &= a_{21} A^1 + a_{22} A^2 + a_{23} A^3, \\
A'^3 &= a_{31} A^1 + a_{32} A^2 + a_{33} A^3,
\end{aligned} \tag{4.45}
$$

or

$$
\begin{aligned}
A'^1 &= \sum_{j=1}^{3} a_{1j} A^j, \\
A'^2 &= \sum_{j=1}^{3} a_{2j} A^j, \\
A'^3 &= \sum_{j=1}^{3} a_{3j} A^j.
\end{aligned} \tag{4.46}
$$

Allowing “i” to stand for any of the indices 1, 2, or 3 makes this:

$$A'^i = \sum_{j=1}^{3} a_{ij} A^j, \qquad i = 1, 2, 3. \tag{4.47}$$

As a final simplification, whenever an index appears twice in the same term,
once as a superscript and once as a subscript (as “j” does in Eq. 4.47), you can
omit the summation symbol and write simply

$$A'^i = a_{ij} A^j, \tag{4.48}$$

in which the reader knows to sum over the repeated index (j in this case).
Such repeated indices are often called “dummy” indices, since any letter may
be used for that index and the result will be the same.⁸ It was Albert Einstein
who first suggested this summation convention, which he jokingly called his
“great discovery in mathematics.”⁹ Whatever you call it, this idea certainly has
saved a lot of ink and time since Einstein proposed it in 1916.

Before moving on, you should take a careful look at Eq. 4.48 and make 
sure you understand that these few symbols mean exactly the same thing as 
the many terms in the three separate equations of Eq. 4.45. They tell you that
each component in the primed coordinate system is a weighted linear combination
of the components in the original (unprimed) coordinate system, with
the transformation matrix elements ($a_{ij}$) providing the weighting factors for
each term.

8 Unlike the repeated “dummy” indices which indicate summation, i is called a “free” index and
no summation is implied.

9 Pais, A. 1983, Subtle Is the Lord: The Science and the Life of Albert Einstein, Oxford
University Press, Oxford.
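
In code, the summation convention corresponds to a loop over the free index with an inner sum over the dummy index. The following plain-Python sketch makes that explicit; the function name `transform`, the 30° rotation about the third axis, and the sample components are all illustrative assumptions.

```python
import math


def transform(a, A):
    """Return A'^i = a_ij A^j: i is the free index, the repeated j is summed."""
    n = len(A)
    return [sum(a[i][j] * A[j] for j in range(n)) for i in range(n)]


# Hypothetical transformation matrix: coordinate axes rotated by 30 degrees
# about the third axis.
angle = math.radians(30.0)
c, s = math.cos(angle), math.sin(angle)
a = [
    [c, s, 0.0],
    [-s, c, 0.0],
    [0.0, 0.0, 1.0],
]
A = [2.0, 5.5, 1.0]   # hypothetical contravariant components A^j

A_prime = transform(a, A)

# The same result written out term by term, as in the three lines of Eq. 4.45.
A_prime_long = [
    a[0][0] * A[0] + a[0][1] * A[1] + a[0][2] * A[2],
    a[1][0] * A[0] + a[1][1] * A[1] + a[1][2] * A[2],
    a[2][0] * A[0] + a[2][1] * A[1] + a[2][2] * A[2],
]
print(A_prime, A_prime_long)
```

Renaming `j` to any other letter inside `transform` changes nothing, which is the sense in which the repeated index is a "dummy" index.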

And if you want to know the exact meaning of each of those factors in 
the transformation of covariant and contravariant vector components, the next 
section will help with that. 


4.8 Quantities that transform contravariantly 

With the convenience of index notation and the summation convention at your 
disposal, you should be ready to take the next step in the transition from think- 
ing of vectors as quantities with magnitude and direction to understanding why 
vectors belong to the class of objects known as tensors. That step begins by 
asking the question of how a differential element of length ds transforms from 
one coordinate system to another. 

In general, the equations relating the coordinates in one system to those in
another do not involve simple linear combinations of coordinate values. For
example, in transforming from spherical $(r, \theta, \phi)$ to Cartesian $(x, y, z)$ coordinates,
it’s not possible to write equations such as $x = a_{11} r + a_{12} \theta + a_{13} \phi$,
because x depends on the product of r with the sine of $\theta$ and the cosine
of $\phi$. And y and z have similar non-linear relationships to the spherical
coordinates.

If, however, you ask how the differentials of x, y, and z (that is, dx, dy, and
dz) depend on the differentials of r, $\theta$, and $\phi$ (that is, $dr$, $d\theta$, and $d\phi$), you’ll
find that on this infinitesimally small scale, dx does depend linearly on $dr$, $d\theta$,
and $d\phi$ (as do dy and dz). So you are able to write

$$dx = a_{11}\, dr + a_{12}\, d\theta + a_{13}\, d\phi, \tag{4.49}$$

and likewise for dy and dz.
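
A quick numerical experiment illustrates the linearity claimed in Eq. 4.49. The sketch below assumes the usual convention $x = r \sin\theta \cos\phi$ (with $\theta$ measured from the z-axis), takes the weighting factors to be the partial derivatives of x, and compares the linear prediction with the actual change in x for small steps; the sample point and step sizes are arbitrary choices.

```python
import math


def x_of(r, theta, phi):
    """Cartesian x for spherical (r, theta, phi), theta measured from the z-axis."""
    return r * math.sin(theta) * math.cos(phi)


r, theta, phi = 2.0, 0.7, 1.1              # arbitrary sample point (radians)
dr, dtheta, dphi = 1e-6, -2e-6, 3e-6       # small coordinate changes

# Weighting factors of Eq. 4.49, taken as the partial derivatives of x.
a11 = math.sin(theta) * math.cos(phi)         # dx/dr
a12 = r * math.cos(theta) * math.cos(phi)     # dx/dtheta
a13 = -r * math.sin(theta) * math.sin(phi)    # dx/dphi

dx_linear = a11 * dr + a12 * dtheta + a13 * dphi
dx_actual = x_of(r + dr, theta + dtheta, phi + dphi) - x_of(r, theta, phi)
print(dx_linear, dx_actual)
```

The two numbers agree to within terms of second order in the step sizes, and the agreement improves as the steps shrink, which is what "linear on an infinitesimally small scale" means in practice.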

For any two coordinate systems in which a linear relationship exists between 
differential length elements, writing the equations which transform between 
the systems is straightforward. If you call the differentials of one coordinate 
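system dx, dy, and dz and the other coordinate system dx', dy', and dz',
the transformation equations from the unprimed to the primed systems come
directly from the rules of partial differentiation, as shown in the left column
below:

$$
\begin{aligned}
dx' &= \frac{\partial x'}{\partial x} dx + \frac{\partial x'}{\partial y} dy + \frac{\partial x'}{\partial z} dz &\qquad dx'^1 &= \frac{\partial x'^1}{\partial x^1} dx^1 + \frac{\partial x'^1}{\partial x^2} dx^2 + \frac{\partial x'^1}{\partial x^3} dx^3 \\
dy' &= \frac{\partial y'}{\partial x} dx + \frac{\partial y'}{\partial y} dy + \frac{\partial y'}{\partial z} dz &\qquad dx'^2 &= \frac{\partial x'^2}{\partial x^1} dx^1 + \frac{\partial x'^2}{\partial x^2} dx^2 + \frac{\partial x'^2}{\partial x^3} dx^3 \\
dz' &= \frac{\partial z'}{\partial x} dx + \frac{\partial z'}{\partial y} dy + \frac{\partial z'}{\partial z} dz &\qquad dx'^3 &= \frac{\partial x'^3}{\partial x^1} dx^1 + \frac{\partial x'^3}{\partial x^2} dx^2 + \frac{\partial x'^3}{\partial x^3} dx^3.
\end{aligned} \tag{4.50}
$$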


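Using the index-notation approach of substituting $x^1$, $x^2$, and $x^3$ for x, y,
and z results in the column shown on the right. Putting this into matrix
notation gives

$$
\begin{pmatrix} dx'^1 \\ dx'^2 \\ dx'^3 \end{pmatrix} =
\begin{pmatrix}
\frac{\partial x'^1}{\partial x^1} & \frac{\partial x'^1}{\partial x^2} & \frac{\partial x'^1}{\partial x^3} \\
\frac{\partial x'^2}{\partial x^1} & \frac{\partial x'^2}{\partial x^2} & \frac{\partial x'^2}{\partial x^3} \\
\frac{\partial x'^3}{\partial x^1} & \frac{\partial x'^3}{\partial x^2} & \frac{\partial x'^3}{\partial x^3}
\end{pmatrix}
\begin{pmatrix} dx^1 \\ dx^2 \\ dx^3 \end{pmatrix} \tag{4.51}
$$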


or, using individual equations with summation symbols

$$
dx'^1 = \sum_{j=1}^{3} \frac{\partial x'^1}{\partial x^j} dx^j, \qquad
dx'^2 = \sum_{j=1}^{3} \frac{\partial x'^2}{\partial x^j} dx^j, \qquad
dx'^3 = \sum_{j=1}^{3} \frac{\partial x'^3}{\partial x^j} dx^j.
$$

If you now allow the letter i to represent each of the numerical values of the
index (1, 2, and 3), this can be written as

$$dx'^i = \sum_{j=1}^{3} \frac{\partial x'^i}{\partial x^j} dx^j. \tag{4.52}$$

Since the j index is repeated, a final simplification results from the Einstein
summation convention, allowing you to write

$$dx'^i = \frac{\partial x'^i}{\partial x^j} dx^j. \tag{4.53}$$

So index notation has allowed the expression in Eq. 4.50, consisting of three
equations with three terms in each, to be written as this single equation. More
importantly, the form of this equation will help you understand why differential
length elements ($dx^i$) are considered to be contravariant quantities.

10 Superscripts are used for the indices because differential length elements transform as
contravariant quantities, as described later in this section.

To gain that understanding, it’s useful to recall Eq. 4.48 from the previous
section:

$$A'^i = a_{ij} A^j,$$

which tells you that the components of a vector in the primed (transformed)
coordinate system are the weighted linear combination of the components
of that same vector in the unprimed (original) coordinate system. And the
weighting factors $a_{ij}$ are the elements of the transformation matrix.

Now compare Eq. 4.53 to Eq. 4.48. On the left side of both equations,
a primed quantity ($dx'^i$ or $A'^i$) with free index i appears. On the right
side, both equations contain the product of a factor with free index i and
dummy index j ($\partial x'^i/\partial x^j$ or $a_{ij}$) with the left-side quantity unprimed and with
dummy index j ($dx^j$ or $A^j$). And you know that the factor $a_{ij}$ in Eq. 4.48
represents the elements of a transformation matrix for contravariant vector
components between the unprimed and the primed coordinate systems. So
it seems reasonable to conclude that the $\partial x'^i/\partial x^j$ terms in Eq. 4.53 can be
seen as the elements of the transformation matrix for the differential length
elements.

So instead of looking at Eq. 4.53 as simply the index-notation version of the 
chain rule, you should see it as a transformation equation that takes differential 
length elements from the unprimed to the primed coordinate system (just as 
Eq. 4.48 does for the contravariant components of vector A). 

And here’s the important insight: the $\partial x'^i/\partial x^j$ terms are not only the elements
of a transformation matrix from the unprimed to the primed coordinate
system, they’re also the components of the basis vectors tangent to the original
(unprimed) coordinate axes, expressed in the new (primed) coordinate
system.¹¹

Furthermore, you know that basis vectors tangent to the original coordinate 
axes are the covariant basis vectors described earlier. And since contravariant 
vector components combine with covariant basis vectors to produce invariant 
quantities, differential length elements must transform as contravariant vector 
components. This is the reason that the indices are written as superscripts in 
Eqs. 4.51 through 4.53; the differential length element is the “prototype” of 
contravariant vector components. 

Using index notation and representing the components of the basis vectors
as $\partial x'^i/\partial x^j$, you should now understand why the transformation equation for
contravariant components of vector $\vec{A}$ is often written as

$$A'^i = \frac{\partial x'^i}{\partial x^j} A^j. \tag{4.54}$$

Many authors present this as the definition of contravariant components.

11 If you’re wondering how partial derivatives can represent basis vectors, you should review
Section 2.6 of Chapter 2.

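To see how this notation works in practice, consider the transformation
from polar $(r, \theta)$ to two-dimensional Cartesian $(x, y)$ coordinates. In this case,
$x'^1 = x$, $x'^2 = y$, $x^1 = r$, and $x^2 = \theta$, and you know that $x = r\cos(\theta)$ and
$y = r\sin(\theta)$. So what are the weighting factors (that is, the elements of the
transformation matrix) in this case? Taking the appropriate derivatives, you
find that

$$\frac{\partial x'^1}{\partial x^1} = \frac{\partial x}{\partial r} = \cos(\theta), \qquad \frac{\partial x'^2}{\partial x^1} = \frac{\partial y}{\partial r} = \sin(\theta), \tag{4.55}$$

$$\frac{\partial x'^1}{\partial x^2} = \frac{\partial x}{\partial \theta} = -r\sin(\theta), \qquad \frac{\partial x'^2}{\partial x^2} = \frac{\partial y}{\partial \theta} = r\cos(\theta). \tag{4.56}$$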

Are these really the components of the tangent vectors to the original $(r, \theta)$
coordinate axes (that is, are they pointing along those axes)? You can see that
they are by writing these terms as components in the primed coordinate system
(Cartesian in this case):

$$\vec{e}_1 = \frac{\partial x'^1}{\partial x^1} \hat{\imath} + \frac{\partial x'^2}{\partial x^1} \hat{\jmath} = \cos(\theta)\, \hat{\imath} + \sin(\theta)\, \hat{\jmath}, \tag{4.57}$$

$$\vec{e}_2 = \frac{\partial x'^1}{\partial x^2} \hat{\imath} + \frac{\partial x'^2}{\partial x^2} \hat{\jmath} = -r\sin(\theta)\, \hat{\imath} + r\cos(\theta)\, \hat{\jmath}. \tag{4.58}$$

The first of these expressions is a vector pointing radially outward (along the
r-direction in polar coordinates) and the second is a vector pointing perpendicular
to the radial direction (along the $\theta$-direction).¹² This demonstrates that the
partial derivatives in Eq. 4.53 do indeed represent components of the original
(unprimed) covariant basis vectors expressed in the new (primed) coordinate
system.
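
The claim that the partial derivatives in Eqs. 4.55 and 4.56 are tangent vectors can be tested by stepping a small distance along each polar coordinate and seeing which way the Cartesian position moves. This is a minimal sketch in plain Python using central differences; the function names, the step size `h`, and the sample point are illustrative assumptions.

```python
import math


def to_cartesian(r, theta):
    """Polar (r, theta) to Cartesian (x, y)."""
    return (r * math.cos(theta), r * math.sin(theta))


def tangent_vectors(r, theta, h=1e-6):
    """Central-difference estimates of d(x, y)/dr and d(x, y)/dtheta."""
    xp, yp = to_cartesian(r + h, theta)
    xm, ym = to_cartesian(r - h, theta)
    e1 = ((xp - xm) / (2 * h), (yp - ym) / (2 * h))   # tangent to the r-axis
    xp, yp = to_cartesian(r, theta + h)
    xm, ym = to_cartesian(r, theta - h)
    e2 = ((xp - xm) / (2 * h), (yp - ym) / (2 * h))   # tangent to the theta-axis
    return e1, e2


r, theta = 2.5, math.radians(40.0)   # arbitrary sample point
e1, e2 = tangent_vectors(r, theta)

e1_exact = (math.cos(theta), math.sin(theta))            # Eq. 4.57
e2_exact = (-r * math.sin(theta), r * math.cos(theta))   # Eq. 4.58
print(e1, e1_exact)
print(e2, e2_exact)
```

At this sample point the first vector has unit length while the second has length r, a reminder that these coordinate basis vectors are not normalized.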


4.9 Quantities that transform covariantly 

If the differential length element of the previous section serves as the “proto- 
type” for quantities that transform as contravariant vector components, you 
may be wondering if there’s a similar “prototype” for covariant quantities. 
You can answer that question by considering a quantity such as the change 
in temperature with distance (degrees per meter) over some region, which 
you may recognize from Chapter 2 as the gradient of that quantity. Unlike
the differential length element, which has dimensions directly related to the
coordinate dimensions, quantities such as the gradient have dimensions that
include the inverse of the coordinate dimensions (per unit length rather than
length in the case of spatial coordinates). This dimensional consideration suggests
that the gradient may be a good candidate for the prototype of quantities
that transform as covariant vector components. And index notation makes this
easy to see.

12 These basis vectors can be understood in terms of the non-Cartesian unit vectors discussed in
Section 1.5 of Chapter 1.

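Imagine a scalar quantity such as temperature or density whose value at
various positions is given by the function $f(x, y, z)$; the rate of change of
that quantity is $\partial f/\partial x$ in the x-direction, $\partial f/\partial y$ in the y-direction, and $\partial f/\partial z$ in the
z-direction. It’s reasonable to ask how these rates of change vary if the coordinate
system is changed. To answer that question, you can proceed as we did for the
differential length element, using the chain rule for partial derivatives and then
employing index notation as follows:

$$
\begin{aligned}
\frac{\partial f}{\partial x'} &= \frac{\partial f}{\partial x}\frac{\partial x}{\partial x'} + \frac{\partial f}{\partial y}\frac{\partial y}{\partial x'} + \frac{\partial f}{\partial z}\frac{\partial z}{\partial x'}
&\quad\Rightarrow\quad
\frac{\partial f}{\partial x'^1} &= \frac{\partial f}{\partial x^1}\frac{\partial x^1}{\partial x'^1} + \frac{\partial f}{\partial x^2}\frac{\partial x^2}{\partial x'^1} + \frac{\partial f}{\partial x^3}\frac{\partial x^3}{\partial x'^1}, \\
\frac{\partial f}{\partial y'} &= \frac{\partial f}{\partial x}\frac{\partial x}{\partial y'} + \frac{\partial f}{\partial y}\frac{\partial y}{\partial y'} + \frac{\partial f}{\partial z}\frac{\partial z}{\partial y'}
&\quad\Rightarrow\quad
\frac{\partial f}{\partial x'^2} &= \frac{\partial f}{\partial x^1}\frac{\partial x^1}{\partial x'^2} + \frac{\partial f}{\partial x^2}\frac{\partial x^2}{\partial x'^2} + \frac{\partial f}{\partial x^3}\frac{\partial x^3}{\partial x'^2}, \\
\frac{\partial f}{\partial z'} &= \frac{\partial f}{\partial x}\frac{\partial x}{\partial z'} + \frac{\partial f}{\partial y}\frac{\partial y}{\partial z'} + \frac{\partial f}{\partial z}\frac{\partial z}{\partial z'}
&\quad\Rightarrow\quad
\frac{\partial f}{\partial x'^3} &= \frac{\partial f}{\partial x^1}\frac{\partial x^1}{\partial x'^3} + \frac{\partial f}{\partial x^2}\frac{\partial x^2}{\partial x'^3} + \frac{\partial f}{\partial x^3}\frac{\partial x^3}{\partial x'^3}.
\end{aligned}
$$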

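As before, you can write this as a matrix equation

$$
\begin{pmatrix} \frac{\partial f}{\partial x'^1} \\ \frac{\partial f}{\partial x'^2} \\ \frac{\partial f}{\partial x'^3} \end{pmatrix} =
\begin{pmatrix}
\frac{\partial x^1}{\partial x'^1} & \frac{\partial x^2}{\partial x'^1} & \frac{\partial x^3}{\partial x'^1} \\
\frac{\partial x^1}{\partial x'^2} & \frac{\partial x^2}{\partial x'^2} & \frac{\partial x^3}{\partial x'^2} \\
\frac{\partial x^1}{\partial x'^3} & \frac{\partial x^2}{\partial x'^3} & \frac{\partial x^3}{\partial x'^3}
\end{pmatrix}
\begin{pmatrix} \frac{\partial f}{\partial x^1} \\ \frac{\partial f}{\partial x^2} \\ \frac{\partial f}{\partial x^3} \end{pmatrix} \tag{4.59}
$$

or as individual equations using the summation symbol:

$$
\frac{\partial f}{\partial x'^1} = \sum_{j=1}^{3} \frac{\partial x^j}{\partial x'^1} \frac{\partial f}{\partial x^j}, \qquad
\frac{\partial f}{\partial x'^2} = \sum_{j=1}^{3} \frac{\partial x^j}{\partial x'^2} \frac{\partial f}{\partial x^j}, \qquad
\frac{\partial f}{\partial x'^3} = \sum_{j=1}^{3} \frac{\partial x^j}{\partial x'^3} \frac{\partial f}{\partial x^j}.
$$

Once again employing i as the free index gives

$$\frac{\partial f}{\partial x'^i} = \sum_{j=1}^{3} \frac{\partial x^j}{\partial x'^i} \frac{\partial f}{\partial x^j}, \tag{4.60}$$

and the Einstein summation convention simplifies this to

$$\frac{\partial f}{\partial x'^i} = \frac{\partial x^j}{\partial x'^i} \frac{\partial f}{\partial x^j}. \tag{4.61}$$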


Comparing this to the equivalent expression for the differential length element
(Eq. 4.53) suggests that once again the vector components in the primed coordinate
system are the weighted linear combination of the components in the
original coordinate system. But in this case the elements of the transformation
matrix ($\partial x^j/\partial x'^i$) are the inverse of those in the transformation of the differential
length elements (which are $\partial x'^i/\partial x^j$). And just as in that case the $\partial x'^i/\partial x^j$ terms
represent the components of vectors that point along the original coordinate
axes, in this case the $\partial x^j/\partial x'^i$ terms represent the components of vectors that
are perpendicular to the original coordinate surfaces. Hence in this case the
weighting factors are the components of the (contravariant) dual basis vectors,
which means that the components of the gradient vector transform as covariant
components. Of course, for orthonormal coordinate systems the lengths
and directions of the original and dual basis vectors are exactly the same, and
there is no difference between the covariant and contravariant vector components.
In non-orthonormal coordinate systems, this distinction is critically
important.

Again using index notation and representing the components of the dual basis
vectors as $\partial x^j/\partial x'^i$, you probably won’t find it surprising that many authors
define the covariant components of vector $\vec{A}$ as components that transform
according to the equation

$$A'_i = \frac{\partial x^j}{\partial x'^i} A_j. \tag{4.62}$$
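
The two transformation rules can be compared side by side for the polar-to-Cartesian case of the previous section. The sketch below is an illustration under stated assumptions: the unprimed system is polar and the primed system is Cartesian, the test function $f = x^2 + 3y$ and the sample point are arbitrary choices, and the matrix names `J` and `K` are made up for this example. It transforms the gradient components with $\partial x^j/\partial x'^i$ (Eq. 4.61), transforms a coordinate displacement with $\partial x'^i/\partial x^j$ (Eq. 4.53), and checks that their product $df$ is the same in both systems.

```python
import math

r, theta = 2.0, math.radians(35.0)   # arbitrary sample point
c, s = math.cos(theta), math.sin(theta)
x = r * c

# J[i][j] = d x'^i / d x^j : Cartesian w.r.t. polar (Eqs. 4.55 and 4.56).
J = [[c, -r * s],
     [s, r * c]]
# K[j][i] = d x^j / d x'^i : polar w.r.t. Cartesian,
# from r = sqrt(x^2 + y^2) and theta = atan2(y, x).
K = [[c, s],
     [-s / r, c / r]]

# Test function f = x^2 + 3y, written in polar form as
# f = r^2 cos^2(theta) + 3 r sin(theta).
grad_polar = (2 * r * c * c + 3 * s,              # df/dr
              -2 * r * r * c * s + 3 * r * c)     # df/dtheta

# Covariant rule (Eq. 4.61): df/dx'^i = (d x^j / d x'^i) df/dx^j.
grad_cart = [sum(K[j][i] * grad_polar[j] for j in range(2)) for i in range(2)]

# Contravariant rule (Eq. 4.53): dx'^i = (d x'^i / d x^j) dx^j.
d_polar = (0.01, -0.02)   # hypothetical differentials (dr, dtheta)
d_cart = [sum(J[i][j] * d_polar[j] for j in range(2)) for i in range(2)]

df_polar = sum(g * d for g, d in zip(grad_polar, d_polar))
df_cart = sum(g * d for g, d in zip(grad_cart, d_cart))
print(grad_cart, (2 * x, 3.0), df_polar, df_cart)
```

The transformed gradient matches the Cartesian gradient $(2x, 3)$ computed directly, and the two values of $df$ agree because `K` is the matrix inverse of `J`.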


At this point you should be convinced that vectors are more than just lit- 
tle arrows with magnitude and direction; they’re quantities that transform in 
certain ways between coordinate systems. Specifically, every vector has both 
contravariant and covariant components that transform in predictable ways.
The contravariant components vary in the opposite manner to the basis vectors
pointing along the original coordinate axes, and the covariant components
vary in the same manner as those basis vectors. Most importantly, by combin- 
ing the vector’s contravariant components with the original basis vectors, or 
by combining the vector’s covariant components with the dual basis vectors, 
the resulting quantity (the vector itself) remains invariant under all coordinate 
transformations. It is this characteristic that qualifies vectors to join the ranks 
of tensors. 

Understanding the distinction between contravariant and covariant vector 
components is extremely helpful in understanding tensors, because vectors are 
tensors. Specifically, since all the components of a vector can be delineated 
using only a single index, vectors are tensors of rank one. Under this definition, 
scalars are tensors of rank zero, since scalars are single numbers and require 
no index at all. And of what use are tensors of rank two and higher? You’ll 
encounter those in Chapter 5. 


4.10 Chapter 4 problems 

4.1 Write the inverse transformation matrix for a 70° rotation of the 2-D
Cartesian coordinate axes and the direct transformation matrix for the
rotation of a vector through an angle of 70°. Show that the
product of these two transformation matrices is the identity matrix.

4.2 Use the inverse transformation matrix from Problem 4.1 to find the
components of vector $\vec{A} = 2\hat{\imath} + 5.5\hat{\jmath}$ in the rotated coordinate system.

4.3 Use the direct transformation matrix from Problem 4.1 to rotate the original
coordinate basis vectors $\hat{\imath}$ and $\hat{\jmath}$ by 70°, so they point along the rotated
axes.

4.4 Use a direct transformation matrix to rotate vector $\vec{A}$ from Problem 4.2
through an angle of −70°, and compare the x- and y-components of
the rotated vector (in the original coordinate system) to the x'- and
y'-components of the unrotated vector in the rotated coordinate system.

4.5 Use the dot product of the original vector $\vec{A}$ with the rotated basis vectors
($\vec{A} \cdot \hat{\imath}'$ and $\vec{A} \cdot \hat{\jmath}'$) to find the components of $\vec{A}$ in the rotated coordinate
system.

4.6 For vector $\vec{A} = -5\hat{\imath} + 6\hat{\jmath}$ and basis vectors $\vec{e}_1 = \hat{\imath} + 2\hat{\jmath}$ and $\vec{e}_2 = -2\hat{\imath} - \hat{\jmath}$,
find the contravariant components $A^1$ and $A^2$.

4.7 Find the dual basis vectors $\vec{e}^1$ and $\vec{e}^2$ for the basis vectors $\vec{e}_1$ and $\vec{e}_2$ of
Problem 4.6.

4.8 Find the covariant components $A_1$ and $A_2$ for vector $\vec{A}$ of Problem 4.6.

4.9 Use the substitution method and the elimination method to solve the two
simultaneous equations that result from vector Eq. 4.26.

4.10 Show that the elements of the Cartesian-to-polar transformation matrix
are the components of the basis vectors tangent to the original (Cartesian)
coordinate axes.


5

The previous chapter contains several ideas that are important to a full
understanding of tensors. The first is that any vector may be represented by 
components that transform between coordinate systems in one of two ways. 
“Covariant” components transform in the same manner as the original basis 
vectors pointing along the coordinate axes, and “contravariant” components 
transform in the inverse manner of those basis vectors. 1 The second main idea 
is that coordinate basis vectors are tangent to the coordinate axes, and that 
there also exist reciprocal or dual basis vectors that are perpendicular to the 
coordinate axes; these dual basis vectors transform inversely to the coordinate 
basis vectors. The third idea is that combining contravariant components with 
original basis vectors and combining covariant components with dual basis 
vectors produces a result that is invariant under coordinate transformation. That 
result is the vector itself, and the vector is the same no matter which coordinate 
system you use for its components. 

This chapter extends the concepts of covariance and contravariance beyond 
vectors and makes it clear that scalars and vectors are members of the class of 
objects called “tensors.” 


5.1 Definitions (advanced) 

In the basic definitions of Chapter 1, scalars, vectors, and tensors were defined
by the number of directions involved: zero for scalars, one for vectors, and
more than one for tensors. 2

1 The prototype of a vector expressed in contravariant components is the displacement vector,
and the prototype of a vector expressed in covariant components is the gradient vector.

2 Note that specifying one direction in 3-dimensional space requires two angles.

Now that you’ve seen the concepts of components, basis vectors, and the
transformation properties of each, you’re in a position to understand the
more-advanced definitions of scalars, vectors, and tensors. Specifically:


A scalar is a single value with no directional indicator that represents a 
quantity that does not vary as the coordinate system is changed. 


So for a scalar with value $\phi$ in one coordinate system and value $\phi'$ in another
coordinate system, you can be certain that the quantity represented by $\phi$ (combined
with the relevant unit) and $\phi'$ (combined with its unit) is the same no
matter which system you use to represent it. Thus 1 inch and 2.54 centimeters 
represent the same quantity of length. 


A vector is an array of three values (in 3-D space) called “vector compo- 
nents” that combine with directional indicators (“basis vectors”) to form a 
quantity that does not vary as the coordinate system is changed. 


So vector $\vec{A}$ represents the same entity whether it is expressed using contravariant
components $A^i$ or covariant components $A_i$:

$$ \vec{A} = A^i \vec{e}_i = A_i \vec{e}^i, $$

where $\vec{e}_i$ represents a covariant basis vector and $\vec{e}^i$ represents a contravariant
basis vector.

In transforming between coordinate systems, a vector with contravariant
components $A^j$ in the original (unprimed) coordinate system and contravariant
components $A'^i$ in the new (primed) coordinate system transforms as

$$ A'^i = \frac{\partial x'^i}{\partial x^j} A^j, $$

where the $\partial x'^i/\partial x^j$ terms represent the components in the new coordinate system
of the basis vectors tangent to the original axes.

Likewise, for a vector with covariant components $A_j$ in the original
(unprimed) coordinate system and covariant components $A'_i$ in the new
(primed) coordinate system, the transformation equation is

$$ A'_i = \frac{\partial x^j}{\partial x'^i} A_j, $$

where the $\partial x^j/\partial x'^i$ terms represent the components in the new coordinate system
of the (dual) basis vectors perpendicular to the original axes.


A tensor of rank n is an array of $3^n$ values (in 3-D space) called “tensor components”
that combine with multiple directional indicators (basis vectors) to
form a quantity that does not vary as the coordinate system is changed.


From this definition, you can see that a second-rank tensor has $3^2 = 9$ components
in three-dimensional space. Note that a tensor of rank 0 is a scalar and a
tensor of rank 1 is a vector. 
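
To make the two vector transformation rules above concrete, here is a minimal Python sketch. It assumes a simple linear change of coordinates $x'^i = M^i{}_j x^j$ in two dimensions, so that the partial derivatives $\partial x'^i/\partial x^j$ are just the constant entries of M; the matrix M, the component values, and the function names are all hypothetical and chosen only for illustration.

```python
# Illustrative sketch: contravariant and covariant transformation rules for a
# linear change of coordinates x' = M x (M is an assumed example matrix).

def inverse_2x2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def transform_contravariant(m, comps):
    """A'^i = (dx'^i/dx^j) A^j, with dx'^i/dx^j = m[i][j]."""
    return [sum(m[i][j] * comps[j] for j in range(2)) for i in range(2)]

def transform_covariant(m_inv, comps):
    """A'_i = (dx^j/dx'^i) A_j, with dx^j/dx'^i = m_inv[j][i]."""
    return [sum(m_inv[j][i] * comps[j] for j in range(2)) for i in range(2)]

M = [[2.0, 1.0], [0.0, 3.0]]
M_inv = inverse_2x2(M)

A_upper = [1.0, 2.0]      # contravariant components A^j (assumed values)
B_lower = [3.0, 5.0]      # covariant components B_j (assumed values)

A_upper_new = transform_contravariant(M, A_upper)
B_lower_new = transform_covariant(M_inv, B_lower)

# The combination A^i B_i is a scalar, so it must not change.
scalar_old = sum(a * b for a, b in zip(A_upper, B_lower))
scalar_new = sum(a * b for a, b in zip(A_upper_new, B_lower_new))
```

The individual components change, but because one set transforms with the matrix and the other with its inverse, the sum $A^i B_i$ comes out the same in both systems.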

There is no standard notation for tensors; you may see a tensor represented
with double overhead arrows (such as $\vec{\vec{T}}$) or with a tilde or two-directional
arrow above or below (such as $\tilde{T}$, $\underset{\sim}{T}$, or $\overleftrightarrow{T}$). Many authors don’t bother with
arrows or tildes and represent tensors simply by writing the letter signifying the
tensor with “placeholder” indices to indicate the contravariant and covariant
rank of the tensor (such as $T^{ij}$ or $T^i_j$).


5.2 Covariant, contravariant, and mixed tensors 


You should by this point understand that the expression

$$ A'^i = \frac{\partial x'^i}{\partial x^j} A^j \tag{5.1} $$

presents the contravariant components of vector $\vec{A}$ in the transformed (primed)
coordinate system ($A'^i$) as a weighted sum of the components of $\vec{A}$ in the original
(unprimed) coordinate system ($A^j$). The weighting factors ($\partial x'^i/\partial x^j$) are simply
the elements of the transformation matrix from the unprimed to the primed
coordinate systems, and those elements represent the components of the basis
vectors tangent to the original coordinate axes. With that understanding, a
tensor expression such as

$$ A'^{ij} = \frac{\partial x'^i}{\partial x^k} \frac{\partial x'^j}{\partial x^l} A^{kl} \tag{5.2} $$

should have some recognizable elements. As you can probably surmise, in
this expression $A'^{ij}$ are the contravariant tensor components in the new coordinate
system, $A^{kl}$ are the contravariant tensor components in the original
coordinate system, and $\partial x'^i/\partial x^k$ as well as $\partial x'^j/\partial x^l$ are elements of the transformation
matrix between the original and new coordinate systems. And just as in
Eq. 5.1, the elements of the direct transformation matrix also represent the
basis vectors tangent to the original coordinate axes. But in the vector expression
Eq. 5.1 each component pertains to a single basis vector, whereas the
components in the tensor expression Eq. 5.2 pertain to two basis vectors. This
should seem reasonable to you, since the basic definitions in Chapter 1 state
that vectors involve a single direction while higher-rank tensors involve two or
more directions.

The vector Eq. 5.1 involves contravariant components (as indicated by the
use of superscripted indices in $A'^i$ and $A^j$), but you know that an equivalent
expression exists for the covariant components:

$$ A'_i = \frac{\partial x^j}{\partial x'^i} A_j. \tag{5.3} $$

In this equation, the covariant components of vector $\vec{A}$ in the transformed
(primed) coordinate system ($A'_i$) are expressed as a weighted sum of the covariant
components of $\vec{A}$ in the original (unprimed) coordinate system ($A_j$). In this
case, the weighting factors ($\partial x^j/\partial x'^i$) are the elements of the inverse transformation
matrix from the unprimed to the primed coordinate systems, and those elements
represent the dual basis vectors perpendicular to the original coordinate
axes.

Extending this to a second-rank tensor gives a transformation equation such
as this:

$$ A'_{ij} = \frac{\partial x^k}{\partial x'^i} \frac{\partial x^l}{\partial x'^j} A_{kl}. \tag{5.4} $$

In this expression, $A'_{ij}$ are the covariant tensor components in the new coordinate
system, $A_{kl}$ are the covariant tensor components in the original coordinate
system, and $\partial x^k/\partial x'^i$ as well as $\partial x^l/\partial x'^j$ are elements of the transformation matrix
between the original and new coordinate systems. And much as in Eq. 5.3,
the elements of the transformation matrix represent the dual basis vectors
perpendicular to the original coordinate axes.

As you may have anticipated, another possibility exists for second-rank 
tensors: 


$$ A'^i_j = \frac{\partial x'^i}{\partial x^k} \frac{\partial x^l}{\partial x'^j} A^k_l, \tag{5.5} $$


in which the tensor A is represented by one contravariant and one covariant 
index; each uses the transformation matrix appropriate for its type. 
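
The three second-rank transformation rules (Eqs. 5.2, 5.4, and 5.5) differ only in which matrix is applied to each index, and a small Python sketch can show that side by side. As before, this assumes a linear change of coordinates so the partial derivatives are constants; the matrix M, the component arrays, and the helper names are hypothetical values picked for the example.

```python
# Illustrative sketch: contravariant, covariant, and mixed rank-2 transformations
# under an assumed linear change of coordinates x' = M x.

def inverse_2x2(m):
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def transform_rank2(t, left, right):
    """Return t'[i][j] = sum over k, l of left[i][k] * right[j][l] * t[k][l]."""
    n = len(t)
    return [[sum(left[i][k] * right[j][l] * t[k][l]
                 for k in range(n) for l in range(n))
             for j in range(n)] for i in range(n)]

M = [[2.0, 1.0], [0.0, 3.0]]
M_inv = inverse_2x2(M)
J = M                                                      # J[i][k] = dx'^i/dx^k
K = [[M_inv[k][i] for k in range(2)] for i in range(2)]    # K[i][k] = dx^k/dx'^i

T = [[1.0, 2.0], [3.0, 4.0]]    # example components, reused for each index structure
S = [[5.0, 6.0], [7.0, 8.0]]

T_upper_new = transform_rank2(T, J, J)    # Eq. 5.2: both indices contravariant
S_lower_new = transform_rank2(S, K, K)    # Eq. 5.4: both indices covariant
T_mixed_new = transform_rank2(T, J, K)    # Eq. 5.5: one of each

def full_contraction(upper, lower):
    """Sum of upper[i][j] * lower[i][j]; a scalar, so it should be invariant."""
    return sum(upper[i][j] * lower[i][j] for i in range(2) for j in range(2))

invariant_old = full_contraction(T, S)
invariant_new = full_contraction(T_upper_new, S_lower_new)
```

Pairing every contravariant index with a covariant one, as in `full_contraction`, gives a number that survives the transformation unchanged, which is a handy check that each index received the matrix appropriate for its type.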


5.3 Tensor addition and subtraction 

As you may recall from Section 1.4, two or more vectors can be added simply 
by adding their corresponding components. Hence a single vector equation 
such as

$$ \vec{C} = \vec{A} + \vec{B} \tag{5.6} $$

actually consists of three equations (in three-dimensional space), since each
component of the resultant vector $\vec{C}$ must be the sum of the corresponding
components of vectors $\vec{A}$ and $\vec{B}$:

$$
\begin{aligned}
C_x &= A_x + B_x, \\
C_y &= A_y + B_y, \\
C_z &= A_z + B_z.
\end{aligned}
\tag{5.7}
$$

Higher-order tensors can be added using the same process, provided that the 
tensors to be added have the same structure (that is, they are the same order and 
have the same number of covariant indices and the same number of contravari- 
ant indices). The result of tensor addition is also a tensor, and the resultant 
tensor has the same structure as each of the tensors that are added: 


$$
\begin{aligned}
C_{ij} &= A_{ij} + B_{ij}, \\
C^{ij} &= A^{ij} + B^{ij}.
\end{aligned}
\tag{5.8}
$$


Note that each of these expressions represents more than one equation; the 
exact number depends on the number of values that each index may take on. 
Note also that you can add tensors with any number of covariant and con- 
travariant indices, as long as the tensors being added have the same number of 
each type of index. 

To see that the result of adding two tensors fits the definition of a tensor, consider
how the tensor components $A^i_j$ and $B^i_j$ transform to another coordinate
system:

$$
\begin{aligned}
A'^k_l &= \frac{\partial x'^k}{\partial x^i} \frac{\partial x^j}{\partial x'^l} A^i_j, \\
B'^k_l &= \frac{\partial x'^k}{\partial x^i} \frac{\partial x^j}{\partial x'^l} B^i_j.
\end{aligned}
\tag{5.9}
$$

Hence

$$
\begin{aligned}
A'^k_l + B'^k_l &= \frac{\partial x'^k}{\partial x^i} \frac{\partial x^j}{\partial x'^l} A^i_j + \frac{\partial x'^k}{\partial x^i} \frac{\partial x^j}{\partial x'^l} B^i_j \\
&= \frac{\partial x'^k}{\partial x^i} \frac{\partial x^j}{\partial x'^l} \left( A^i_j + B^i_j \right).
\end{aligned}
$$

If you compare this last expression to the expression for the transformation of
the tensor components $C^i_j$ to the primed coordinate system

$$ C'^k_l = \frac{\partial x'^k}{\partial x^i} \frac{\partial x^j}{\partial x'^l} C^i_j, $$

you’ll see that the addition of $A^i_j$ and $B^i_j$ does produce an object $C^i_j$ that meets
the transformation requirements for a tensor.

Subtraction of tensors is equally straightforward; you simply subtract the 
corresponding components rather than adding them: 

$$
\begin{aligned}
C_{ij} &= A_{ij} - B_{ij}, \\
C^{ij} &= A^{ij} - B^{ij},
\end{aligned}
\tag{5.10}
$$

and the result of tensor subtraction is also a tensor, as you can see in the 
problems at the end of this chapter. 
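
The requirement that added (or subtracted) tensors share the same index structure can also be tested numerically. The Python sketch below is an illustration only: it assumes a linear change of coordinates with a hypothetical matrix M and made-up component arrays, and it compares the sum of two mixed tensors with the component-wise "sum" of a mixed tensor and a covariant one.

```python
# Illustrative sketch: component-wise addition respects the transformation law
# only when both tensors have the same index structure (assumed example values).

def inverse_2x2(m):
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def transform_rank2(t, left, right):
    """Return t'[i][j] = sum over k, l of left[i][k] * right[j][l] * t[k][l]."""
    return [[sum(left[i][k] * right[j][l] * t[k][l]
                 for k in range(2) for l in range(2))
             for j in range(2)] for i in range(2)]

def add(a, b):
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]

def max_abs_diff(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))

M = [[2.0, 1.0], [0.0, 3.0]]
M_inv = inverse_2x2(M)
J = M                                                      # dx'^i/dx^k
K = [[M_inv[k][i] for k in range(2)] for i in range(2)]    # dx^k/dx'^i

def as_mixed(t):        # one contravariant and one covariant index
    return transform_rank2(t, J, K)

def as_covariant(t):    # two covariant indices
    return transform_rank2(t, K, K)

A = [[1.0, 2.0], [3.0, 4.0]]
B = [[5.0, 6.0], [7.0, 8.0]]

# Same structure: transforming then adding equals adding then transforming.
same_structure_error = max_abs_diff(add(as_mixed(A), as_mixed(B)), as_mixed(add(A, B)))

# Mismatched structure: the summed components follow neither transformation law.
mismatched_sum = add(as_mixed(A), as_covariant(B))
mismatch_vs_mixed = max_abs_diff(mismatched_sum, as_mixed(add(A, B)))
mismatch_vs_covariant = max_abs_diff(mismatched_sum, as_covariant(add(A, B)))
```

With matching structures the error is at the level of floating-point round-off, while the mismatched sum disagrees with both candidate transformation laws.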


5.4 Tensor multiplication 

As described in Chapter 2, there are several different ways to multiply vectors - 
the scalar (dot) product and vector (cross) product both take two vectors as 
inputs and produce a result that depends on the magnitudes and directions of 
those two vectors. Not mentioned in that chapter was another form of vector 
product called the “outer” product between a column vector ($\vec{A}$) and a row
vector ($\vec{B}$), which operates like this:

$$
\vec{A} \otimes \vec{B} =
\begin{pmatrix} A_1 \\ A_2 \\ A_3 \end{pmatrix}
\begin{pmatrix} B_1 & B_2 & B_3 \end{pmatrix} =
\begin{pmatrix}
A_1 B_1 & A_1 B_2 & A_1 B_3 \\
A_2 B_1 & A_2 B_2 & A_2 B_3 \\
A_3 B_1 & A_3 B_2 & A_3 B_3
\end{pmatrix}.
$$

Note that the outer product of two rank-1 tensors (vectors) is a rank-2 tensor,
formed simply by multiplying the individual components of the two vectors.
The outer product is indicated with the $\otimes$ symbol in some texts; others just
write the two vectors or tensors next to one another, such as $A^i B^j = C^{ij}$.

The outer-product operation may also be performed on higher-order tensors: 

$$ A^i_j B^k_{lm} = C^{ik}_{jlm}. $$

In this case, the outer product of a rank-2 tensor and a rank-3 tensor is a rank-5
tensor. This illustrates the fact that the covariant rank of the outer-product tensor
is the sum of the covariant ranks of the input tensors, and the contravariant
rank of the outer-product tensor is the sum of the contravariant ranks of the
input tensors.

The result of the outer-product operation is easily shown to be a tensor by
considering how tensors A, B, and C transform from the unprimed to the
primed coordinate system. The transform of tensors A and B is given by

$$
\begin{aligned}
A'^n_o &= \frac{\partial x'^n}{\partial x^i} \frac{\partial x^j}{\partial x'^o} A^i_j, \\
B'^p_{qr} &= \frac{\partial x'^p}{\partial x^k} \frac{\partial x^l}{\partial x'^q} \frac{\partial x^m}{\partial x'^r} B^k_{lm}.
\end{aligned}
$$

Multiplying these expressions gives

$$ A'^n_o B'^p_{qr} = \frac{\partial x'^n}{\partial x^i} \frac{\partial x^j}{\partial x'^o} \frac{\partial x'^p}{\partial x^k} \frac{\partial x^l}{\partial x'^q} \frac{\partial x^m}{\partial x'^r} A^i_j B^k_{lm}. $$

So if $A^i_j B^k_{lm} = C^{ik}_{jlm}$ and $A'^n_o B'^p_{qr} = C'^{np}_{oqr}$, then

$$ C'^{np}_{oqr} = \frac{\partial x'^n}{\partial x^i} \frac{\partial x^j}{\partial x'^o} \frac{\partial x'^p}{\partial x^k} \frac{\partial x^l}{\partial x'^q} \frac{\partial x^m}{\partial x'^r} C^{ik}_{jlm}, \tag{5.11} $$


and the result of the outer product operation does indeed meet the transforma- 
tion requirements for a tensor. 
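
For the simplest case, the outer product of two vectors, the same conclusion can be checked with a few lines of Python. This is a sketch under stated assumptions: the vectors, the linear transformation matrix M, and the helper names are hypothetical, and both vectors are treated as having contravariant components so that each index transforms with M.

```python
# Illustrative sketch: the outer product of two vectors and its transformation.
# Integer example values are used so the comparison is exact.

def outer(a, b):
    """C[i][j] = a[i] * b[j]."""
    return [[a_i * b_j for b_j in b] for a_i in a]

def apply_matrix(m, v):
    """Contravariant vector rule: v'[i] = sum over k of m[i][k] * v[k]."""
    return [sum(m[i][k] * v[k] for k in range(len(v))) for i in range(len(m))]

def transform_upper2(m, t):
    """Rank-2 contravariant rule: t'[i][j] = sum over k, l of m[i][k] * m[j][l] * t[k][l]."""
    n = len(t)
    return [[sum(m[i][k] * m[j][l] * t[k][l] for k in range(n) for l in range(n))
             for j in range(n)] for i in range(n)]

A = [1, 2, 3]
B = [4, 5, 6]
M = [[2, 1, 0],
     [0, 3, 0],
     [1, 0, 1]]    # assumed matrix of partial derivatives dx'^i/dx^k

C = outer(A, B)                                               # rank 1 + rank 1 -> rank 2
C_transformed = transform_upper2(M, C)                        # transform the product
C_from_parts = outer(apply_matrix(M, A), apply_matrix(M, B))  # product of transformed vectors
```

Transforming the product and taking the product of the transformed vectors give identical arrays, which is the two-vector version of Eq. 5.11.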

Another way to multiply tensors is called the “inner product,” which you can 
think of as a generalization of the scalar or dot product discussed in Section 2.1. 
As described in that section, the dot product between two vectors produces a 
scalar result, so you might expect the inner product between two tensors to 
produce a tensor of lower rank. That’s exactly right, but to understand how it 
happens, you first need to understand the process of tensor contraction. 

To contract a tensor, simply set one contravariant index equal to a covariant 
index (or vice versa) and then sum over the repeated index. This leads to a 
tensor with a rank that is two less than the rank of the tensor with which you 
started. 

To see how this works in practice, consider the rank-4 tensor $C^{ij}_{kl}$. To contract
this tensor in the second and third indices, set the index k equal to the index j,
resulting in

$$ C^{ij}_{jl} = C^{i1}_{1l} + C^{i2}_{2l} + C^{i3}_{3l} = D^i_l, $$

assuming that the indices j and k run from 1 to 3. Note that the rank is reduced
by two because you made one index the same as another (reducing the rank
by one) and then you summed over that index (reducing the rank by one
more). Note also that contraction produces another tensor only when the two
indices that are made equal are in different positions (one superscript and one
subscript).

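The reason for this becomes clear if you consider the contraction of the
tensor that resulted from the outer-product operation in Eq. 5.11. Contracting
this tensor in the first and fourth indices by setting q equal to n gives

$$
\begin{aligned}
C'^{np}_{onr} &= \frac{\partial x'^n}{\partial x^i} \frac{\partial x^j}{\partial x'^o} \frac{\partial x'^p}{\partial x^k} \frac{\partial x^l}{\partial x'^n} \frac{\partial x^m}{\partial x'^r} C^{ik}_{jlm} \\
&= \frac{\partial x'^n}{\partial x^i} \frac{\partial x^l}{\partial x'^n} \frac{\partial x^j}{\partial x'^o} \frac{\partial x'^p}{\partial x^k} \frac{\partial x^m}{\partial x'^r} C^{ik}_{jlm} \\
&= \frac{\partial x^l}{\partial x^i} \frac{\partial x^j}{\partial x'^o} \frac{\partial x'^p}{\partial x^k} \frac{\partial x^m}{\partial x'^r} C^{ik}_{jlm}.
\end{aligned}
$$

But the derivative $\partial x^l/\partial x^i$ involves only coordinates in the same (unprimed) system,
and coordinates within the same system must be independent of one
another. Hence this derivative must equal zero unless l = i, in which case
it must equal one. This is most easily expressed using the Kronecker delta
function, defined by

$$ \delta^l_i = \begin{cases} 1 & \text{if } l = i, \\ 0 & \text{if } l \neq i. \end{cases} $$

Thus

$$
\begin{aligned}
C'^{np}_{onr} &= \delta^l_i \frac{\partial x^j}{\partial x'^o} \frac{\partial x'^p}{\partial x^k} \frac{\partial x^m}{\partial x'^r} C^{ik}_{jlm} \\
&= \frac{\partial x^j}{\partial x'^o} \frac{\partial x'^p}{\partial x^k} \frac{\partial x^m}{\partial x'^r} C^{ik}_{jim},
\end{aligned}
$$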


which is a tensor of rank 3, as expected. But note that this reduction from 5 
to 3 in rank required that two of the partial derivatives combine to produce 
the delta function, which then invoked the summation process. That derivative 
combination only works if one of the contracted indices is a superscript and 
the other a subscript. 
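
You can watch this requirement at work in the simplest possible case, a rank-2 tensor contracted down to rank zero. The Python sketch below is illustrative only; it assumes a linear change of coordinates with a hypothetical matrix M and made-up components, and it compares contracting one upper with one lower index against "contracting" two upper indices.

```python
# Illustrative sketch: contraction gives an invariant only when the two summed
# indices are in different positions (assumed example values).

def inverse_2x2(m):
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def transform_rank2(t, left, right):
    """Return t'[i][j] = sum over k, l of left[i][k] * right[j][l] * t[k][l]."""
    return [[sum(left[i][k] * right[j][l] * t[k][l]
                 for k in range(2) for l in range(2))
             for j in range(2)] for i in range(2)]

def contract(t):
    """Set the two indices equal and sum: t[0][0] + t[1][1]."""
    return sum(t[i][i] for i in range(2))

M = [[2.0, 1.0], [0.0, 3.0]]
M_inv = inverse_2x2(M)
J = M                                                      # dx'^i/dx^k
K = [[M_inv[k][i] for k in range(2)] for i in range(2)]    # dx^k/dx'^i

T = [[1.0, 2.0], [3.0, 4.0]]

# Read T as a mixed tensor T^i_j: the contraction is the same in both systems.
mixed_before = contract(T)
mixed_after = contract(transform_rank2(T, J, K))

# Read T as a contravariant tensor T^ij: summing equal upper indices is not invariant.
upper_before = contract(T)
upper_after = contract(transform_rank2(T, J, J))
```

For the mixed tensor the direct and inverse matrices cancel, just as the two partial derivatives combined into the delta function above; for two upper indices no such cancellation occurs and the sum changes from 5 to 54.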

In this last example, the contraction was performed on a tensor that was the 
result of an outer product. That two-step process (outer-product multiplication 
followed by contraction) is called the “inner product” of two tensors. So if you 
start with two vectors (tensors of rank 1), form their outer product (producing 
a tensor of rank 2), and then contract the result, you end up with a tensor 
of rank zero - a scalar. This illustrates why the inner-product process can be 
considered to be a generalization of the dot product between two vectors.


5.5 Metric tensor


As you think about contravariant and covariant components of vectors and 
tensors, you should not lose sight of the fact that these components exist only 
when you’ve selected a coordinate system. And why do you need a coordinate 
system? Because coordinate systems “arithmetize” space - that is, they give 
you a way of applying the rules of arithmetic to objects that exist in the space 
in which you’re working. That space may be the three-dimensional space of
everyday experience, or the four-dimensional spacetime of Einstein, or any other
space you can imagine. The coordinate system you apply may have straight 
axes that intersect at right angles, or the axes may be curved and intersect at 
any angle of your choosing. 

However you choose to arithmetize a space, there is one tensor that allows 
you to define fundamental quantities such as lengths and angles in a consistent 
manner at different locations. That tensor, the one that “provides the metric” 
for a given coordinate system in the space of interest, is called the fundamental 
or metric tensor. The lower-case letter “g” has become the standard symbol for
the metric tensor, which you may see written as $\mathbf{g}$ or $\vec{\vec{g}}$. The metric tensor has
contravariant components $g^{ij}$ and covariant components $g_{ij}$.

To understand the role of the metric tensor, consider two points separated
by an infinitesimal distance ds. If the vector $d\vec{r}$ extends from one point to
the other, then the square of the differential length element may be written as
$ds^2 = d\vec{r} \circ d\vec{r}$. The vector $d\vec{r}$ may be written using contravariant components
and coordinate basis vectors ($\vec{e}_i$) as

$$ d\vec{r} = \vec{e}_i \, dx^i, $$

or using covariant components and dual basis vectors ($\vec{e}^i$) as

$$ d\vec{r} = \vec{e}^i \, dx_i. $$

Since $ds^2$ involves the dot product of $d\vec{r}$ with itself, you have the option of
using the contravariant components $dx^i$ on both sides of the dot:

$$
\begin{aligned}
ds^2 = d\vec{r} \circ d\vec{r} &= \vec{e}_i \, dx^i \circ \vec{e}_j \, dx^j \\
&= (\vec{e}_i \circ \vec{e}_j) \, dx^i dx^j \\
&= g_{ij} \, dx^i dx^j,
\end{aligned}
$$

where $g_{ij}$ represents the covariant components of the metric tensor. Alternatively,
you may use the covariant components $dx_i$ on both sides of
the dot:

$$
\begin{aligned}
ds^2 &= \vec{e}^i \, dx_i \circ \vec{e}^j \, dx_j \\
&= (\vec{e}^i \circ \vec{e}^j) \, dx_i dx_j \\
&= g^{ij} \, dx_i dx_j,
\end{aligned}
$$

where $g^{ij}$ represents contravariant components of the metric tensor. A third
option is to use contravariant components on one side of the dot and covariant
components on the other:

$$
\begin{aligned}
ds^2 &= \vec{e}_i \, dx^i \circ \vec{e}^j \, dx_j \\
&= (\vec{e}_i \circ \vec{e}^j) \, dx^i dx_j \\
&= dx^i dx_i.
\end{aligned}
$$

Note that in this case no metric tensor is needed, since the definition of dual
basis vectors ensures that $\vec{e}_i \circ \vec{e}^j$ equals one if i = j and zero if i ≠ j.

Whether $ds^2$ is written as $g_{ij} dx^i dx^j$, $g^{ij} dx_i dx_j$, or $dx^i dx_i$, you can be
sure of one thing: the distance between two points must be the same no matter
which coordinate system you employ, whether you use contravariant, covariant,
or mixed components. Hence it must be the job of the metric tensor g
and its components $g^{ij}$ and $g_{ij}$ to turn the product of incremental coordinate
changes expressed in either contravariant or covariant components into the 
invariant distance between points. This is the rationale behind the statement 
that the metric tensor “provides the geometry” of the space. 

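The geometry of vectors entails use of lengths and angles, so it’s useful
to understand the role of the metric tensor in defining the length of a vector
such as $\vec{A}$ and the angle between two vectors $\vec{A}$ and $\vec{B}$. Just as the incremental
distance ds can be found by dotting the separation vector $d\vec{r}$ into itself, the
length of vector $\vec{A}$ can be found from $\vec{A} \circ \vec{A}$. And there’s more than one way
to do that.

One option is to use only the contravariant components of $\vec{A}$:

$$ |\vec{A}| = \sqrt{\vec{A} \circ \vec{A}} = \sqrt{A^i \vec{e}_i \circ A^j \vec{e}_j} = \sqrt{(\vec{e}_i \circ \vec{e}_j) A^i A^j} = \sqrt{g_{ij} A^i A^j}. $$

Another option is to use only covariant components:

$$ |\vec{A}| = \sqrt{\vec{A} \circ \vec{A}} = \sqrt{A_i \vec{e}^i \circ A_j \vec{e}^j} = \sqrt{(\vec{e}^i \circ \vec{e}^j) A_i A_j} = \sqrt{g^{ij} A_i A_j}. $$

And the final option is to use mixed components:

$$ |\vec{A}| = \sqrt{\vec{A} \circ \vec{A}} = \sqrt{A^i \vec{e}_i \circ A_j \vec{e}^j} = \sqrt{(\vec{e}_i \circ \vec{e}^j) A^i A_j} = \sqrt{A^i A_i}. $$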

As in the case of dr, the metric tensor ensures that the length of vector A is 
invariant. 

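To understand the role of the metric tensor in providing a consistent definition
of angles, consider the dot product $\vec{A} \circ \vec{B}$. Once again, there are alternative
ways of writing this product, and this means that the angle between $\vec{A}$ and $\vec{B}$
can be written in the following equivalent ways:

$$
\begin{aligned}
\cos\theta &= \frac{\vec{A} \circ \vec{B}}{|\vec{A}| |\vec{B}|} \\
&= \frac{g_{ij} A^i B^j}{\sqrt{g_{ij} A^i A^j} \sqrt{g_{ij} B^i B^j}} \\
&= \frac{A_i B^i}{\sqrt{A_i A^i} \sqrt{B_i B^i}} \\
&= \frac{g^{ij} A_i B_j}{\sqrt{g^{ij} A_i A_j} \sqrt{g^{ij} B_i B_j}}.
\end{aligned}
$$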


This explains why you’re likely to run into the statement that the metric tensor 
“provides a dot product” for a space - if you know how to find the dot product, 
you can define lengths and angles. 
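
A short numerical experiment shows these equivalent forms agreeing. The Python sketch below is only an illustration: it assumes a particular non-orthogonal 2-D basis and two example vectors (none of which come from the text), builds $g_{ij}$ and $g^{ij}$ from dot products of the basis and dual basis vectors, and then evaluates the lengths and the angle in each of the three ways.

```python
# Illustrative sketch: lengths and angles from the metric in an assumed oblique basis.
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

e = [(1.0, 0.0), (1.0, 1.0)]      # basis vectors e_1, e_2 (assumed)
d = [(1.0, -1.0), (0.0, 1.0)]     # dual basis vectors e^1, e^2 for this basis

g_lower = [[dot(e[i], e[j]) for j in range(2)] for i in range(2)]   # g_ij = e_i . e_j
g_upper = [[dot(d[i], d[j]) for j in range(2)] for i in range(2)]   # g^ij = e^i . e^j

A = (3.0, 4.0)                    # Cartesian components, used only to find the others
B = (4.0, 3.0)

A_up = [dot(A, d[i]) for i in range(2)]   # contravariant components A^i
A_dn = [dot(A, e[i]) for i in range(2)]   # covariant components A_i
B_up = [dot(B, d[i]) for i in range(2)]
B_dn = [dot(B, e[i]) for i in range(2)]

def form(g, u, v):
    """Sum over i, j of g[i][j] * u[i] * v[j]."""
    return sum(g[i][j] * u[i] * v[j] for i in range(2) for j in range(2))

# Squared length of A, three ways.
len2_contra = form(g_lower, A_up, A_up)
len2_co = form(g_upper, A_dn, A_dn)
len2_mixed = dot(A_up, A_dn)

# Cosine of the angle between A and B, three ways.
cos_contra = form(g_lower, A_up, B_up) / math.sqrt(
    form(g_lower, A_up, A_up) * form(g_lower, B_up, B_up))
cos_co = form(g_upper, A_dn, B_dn) / math.sqrt(
    form(g_upper, A_dn, A_dn) * form(g_upper, B_dn, B_dn))
cos_mixed = dot(A_dn, B_up) / math.sqrt(dot(A_dn, A_up) * dot(B_dn, B_up))
```

All three length calculations give 25 for the squared length (the Cartesian value $3^2 + 4^2$), and all three angle calculations give 0.96, even though the components themselves look nothing like the Cartesian ones.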

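To see the tensor nature of the metric tensor, consider the transformation of
the contravariant components of the incremental separation vector $d\vec{r}$:

$$ dx'^i = \frac{\partial x'^i}{\partial x^j} dx^j. $$

This means that the square of the incremental length ($ds^2$) becomes:

$$
\begin{aligned}
ds^2 ={}& \left[ \frac{\partial x'^1}{\partial x^1} \frac{\partial x'^1}{\partial x^1} + \frac{\partial x'^2}{\partial x^1} \frac{\partial x'^2}{\partial x^1} + \frac{\partial x'^3}{\partial x^1} \frac{\partial x'^3}{\partial x^1} \right] dx^1 dx^1 \\
&+ \left[ \frac{\partial x'^1}{\partial x^2} \frac{\partial x'^1}{\partial x^2} + \frac{\partial x'^2}{\partial x^2} \frac{\partial x'^2}{\partial x^2} + \frac{\partial x'^3}{\partial x^2} \frac{\partial x'^3}{\partial x^2} \right] dx^2 dx^2 \\
&+ \left[ \frac{\partial x'^1}{\partial x^3} \frac{\partial x'^1}{\partial x^3} + \frac{\partial x'^2}{\partial x^3} \frac{\partial x'^2}{\partial x^3} + \frac{\partial x'^3}{\partial x^3} \frac{\partial x'^3}{\partial x^3} \right] dx^3 dx^3 \\
&+ \left[ \frac{\partial x'^1}{\partial x^1} \frac{\partial x'^1}{\partial x^2} + \frac{\partial x'^2}{\partial x^1} \frac{\partial x'^2}{\partial x^2} + \frac{\partial x'^3}{\partial x^1} \frac{\partial x'^3}{\partial x^2} \right] dx^1 dx^2 \\
&+ \left[ \frac{\partial x'^1}{\partial x^2} \frac{\partial x'^1}{\partial x^1} + \frac{\partial x'^2}{\partial x^2} \frac{\partial x'^2}{\partial x^1} + \frac{\partial x'^3}{\partial x^2} \frac{\partial x'^3}{\partial x^1} \right] dx^2 dx^1 \\
&+ \left[ \frac{\partial x'^1}{\partial x^1} \frac{\partial x'^1}{\partial x^3} + \frac{\partial x'^2}{\partial x^1} \frac{\partial x'^2}{\partial x^3} + \frac{\partial x'^3}{\partial x^1} \frac{\partial x'^3}{\partial x^3} \right] dx^1 dx^3 \\
&+ \left[ \frac{\partial x'^1}{\partial x^3} \frac{\partial x'^1}{\partial x^1} + \frac{\partial x'^2}{\partial x^3} \frac{\partial x'^2}{\partial x^1} + \frac{\partial x'^3}{\partial x^3} \frac{\partial x'^3}{\partial x^1} \right] dx^3 dx^1 \\
&+ \left[ \frac{\partial x'^1}{\partial x^2} \frac{\partial x'^1}{\partial x^3} + \frac{\partial x'^2}{\partial x^2} \frac{\partial x'^2}{\partial x^3} + \frac{\partial x'^3}{\partial x^2} \frac{\partial x'^3}{\partial x^3} \right] dx^2 dx^3 \\
&+ \left[ \frac{\partial x'^1}{\partial x^3} \frac{\partial x'^1}{\partial x^2} + \frac{\partial x'^2}{\partial x^3} \frac{\partial x'^2}{\partial x^2} + \frac{\partial x'^3}{\partial x^3} \frac{\partial x'^3}{\partial x^2} \right] dx^3 dx^2.
\end{aligned}
\tag{5.12}
$$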


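This daunting expression becomes far more tractable if you realize that each
bracketed term involves the sum of the partial derivatives of each of the transformed
coordinates ($x'^1$, $x'^2$, and $x'^3$) taken with respect to two of the original
coordinates ($x^1$, $x^2$, and $x^3$). More specifically, each of the three terms within
each bracket is a product of the components of the basis vectors tangent to the
original axes (recall that $\partial x'^1/\partial x^i$, $\partial x'^2/\partial x^i$, and $\partial x'^3/\partial x^i$ are the components in the
transformed coordinate system of the basis vector tangent to the ith original axis).

If you assign the bracketed terms to the variable g with two subscripts
denoting the axes with respect to which the derivatives are taken, you will
have

$$
\begin{aligned}
g_{11} &= \frac{\partial x'^1}{\partial x^1} \frac{\partial x'^1}{\partial x^1} + \frac{\partial x'^2}{\partial x^1} \frac{\partial x'^2}{\partial x^1} + \frac{\partial x'^3}{\partial x^1} \frac{\partial x'^3}{\partial x^1}, \\
g_{22} &= \frac{\partial x'^1}{\partial x^2} \frac{\partial x'^1}{\partial x^2} + \frac{\partial x'^2}{\partial x^2} \frac{\partial x'^2}{\partial x^2} + \frac{\partial x'^3}{\partial x^2} \frac{\partial x'^3}{\partial x^2}, \\
g_{33} &= \frac{\partial x'^1}{\partial x^3} \frac{\partial x'^1}{\partial x^3} + \frac{\partial x'^2}{\partial x^3} \frac{\partial x'^2}{\partial x^3} + \frac{\partial x'^3}{\partial x^3} \frac{\partial x'^3}{\partial x^3}, \\
g_{12} &= \frac{\partial x'^1}{\partial x^1} \frac{\partial x'^1}{\partial x^2} + \frac{\partial x'^2}{\partial x^1} \frac{\partial x'^2}{\partial x^2} + \frac{\partial x'^3}{\partial x^1} \frac{\partial x'^3}{\partial x^2}, \\
g_{13} &= \frac{\partial x'^1}{\partial x^1} \frac{\partial x'^1}{\partial x^3} + \frac{\partial x'^2}{\partial x^1} \frac{\partial x'^2}{\partial x^3} + \frac{\partial x'^3}{\partial x^1} \frac{\partial x'^3}{\partial x^3}, \\
g_{23} &= \frac{\partial x'^1}{\partial x^2} \frac{\partial x'^1}{\partial x^3} + \frac{\partial x'^2}{\partial x^2} \frac{\partial x'^2}{\partial x^3} + \frac{\partial x'^3}{\partial x^2} \frac{\partial x'^3}{\partial x^3},
\end{aligned}
$$

and since the order of multiplication is irrelevant, $g_{21} = g_{12}$, $g_{31} = g_{13}$, and
$g_{32} = g_{23}$. Substituting these into Eq. 5.12, the expression for $ds^2$ becomes

$$
\begin{aligned}
ds^2 ={}& g_{11} dx^1 dx^1 + g_{22} dx^2 dx^2 + g_{33} dx^3 dx^3 + g_{12} dx^1 dx^2 + g_{21} dx^2 dx^1 \\
&+ g_{13} dx^1 dx^3 + g_{31} dx^3 dx^1 + g_{23} dx^2 dx^3 + g_{32} dx^3 dx^2.
\end{aligned}
$$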


This can be further simplified using index notation and the summation 
convention: 

$$ ds^2 = g_{ij} dx^i dx^j. \tag{5.13} $$


The $g_{ij}$ term in this equation meets all the requirements of a second-rank
tensor, but it’s not just any tensor. Because it relates the coordinate differentials 
in various directions to a quantity that is invariant across all coordinate trans- 
formations, it’s no wonder that this tensor is called the metric or fundamental 
tensor. 

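To understand what’s so fundamental about this tensor, recall that the partial
derivatives that make up the elements of $g_{ij}$ also represent the components of
the basis vectors tangent to the original coordinate axes:

$$
\begin{aligned}
\vec{e}_1 &= \left( \frac{\partial x'^1}{\partial x^1}, \frac{\partial x'^2}{\partial x^1}, \frac{\partial x'^3}{\partial x^1} \right), \\
\vec{e}_2 &= \left( \frac{\partial x'^1}{\partial x^2}, \frac{\partial x'^2}{\partial x^2}, \frac{\partial x'^3}{\partial x^2} \right), \\
\vec{e}_3 &= \left( \frac{\partial x'^1}{\partial x^3}, \frac{\partial x'^2}{\partial x^3}, \frac{\partial x'^3}{\partial x^3} \right).
\end{aligned}
\tag{5.14}
$$

And since

$$ g_{ij} = \frac{\partial x'^1}{\partial x^i} \frac{\partial x'^1}{\partial x^j} + \frac{\partial x'^2}{\partial x^i} \frac{\partial x'^2}{\partial x^j} + \frac{\partial x'^3}{\partial x^i} \frac{\partial x'^3}{\partial x^j}, \tag{5.15} $$

another way to represent the metric tensor is $g_{ij} = \vec{e}_i \circ \vec{e}_j$ (the inner product
of the basis vectors tangent to the coordinate axes). Since the inner product
involves the projection of one vector onto the direction of another and scales
as the length of those two vectors, the elements of $g_{ij}$ specify the relationships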
between the coordinate axes. Those relationships are determined by the shape 
of the coordinate space. 
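
As a quick illustration of $g_{ij} = \vec{e}_i \circ \vec{e}_j$, the Python sketch below builds the metric for cylindrical coordinates $(\rho, \phi, z)$. Cylindrical coordinates are not the example used in this chapter; they are assumed here simply because their basis vectors are easy to write down, and the function names and the sample point are hypothetical.

```python
# Illustrative sketch: metric elements as dot products of the basis vectors
# tangent to the coordinate axes, for assumed cylindrical coordinates.
import math

def cylindrical_basis(rho, phi, z):
    """Cartesian components of the basis vectors tangent to the rho, phi, and z axes.

    Each vector is the derivative of (x, y, z) = (rho cos(phi), rho sin(phi), z)
    with respect to one of the cylindrical coordinates.
    """
    e_rho = (math.cos(phi), math.sin(phi), 0.0)
    e_phi = (-rho * math.sin(phi), rho * math.cos(phi), 0.0)
    e_z = (0.0, 0.0, 1.0)
    return [e_rho, e_phi, e_z]

def metric_from_basis(basis):
    """g[i][j] = e_i . e_j"""
    return [[sum(a * b for a, b in zip(e_i, e_j)) for e_j in basis] for e_i in basis]

g = metric_from_basis(cylindrical_basis(2.0, 0.7, -1.0))
```

At $\rho = 2$ the result is diagonal with entries 1, 4, and 1 (that is, 1, $\rho^2$, and 1): the vanishing off-diagonal elements say the axes are mutually perpendicular, and the $\rho^2$ says that the basis vector along $\phi$ grows in length as you move away from the axis.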

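The nature of the metric tensor can be readily understood by considering a
transformation from spherical polar ($r, \theta, \phi$) to Cartesian ($x, y, z$) coordinates.
In this case

$$
\begin{aligned}
x'^1 &= x = r \sin(\theta) \cos(\phi) = x^1 \sin(x^2) \cos(x^3), \\
x'^2 &= y = r \sin(\theta) \sin(\phi) = x^1 \sin(x^2) \sin(x^3), \\
x'^3 &= z = r \cos(\theta) = x^1 \cos(x^2),
\end{aligned}
\tag{5.16}
$$

and the partial derivatives appearing in the elements of the metric tensor are

$$
\begin{aligned}
\frac{\partial x'^1}{\partial x^1} &= \sin(x^2) \cos(x^3) = \sin(\theta) \cos(\phi), \\
\frac{\partial x'^1}{\partial x^2} &= x^1 \cos(x^2) \cos(x^3) = r \cos(\theta) \cos(\phi), \\
\frac{\partial x'^2}{\partial x^1} &= \sin(x^2) \sin(x^3) = \sin(\theta) \sin(\phi), \\
\frac{\partial x'^2}{\partial x^2} &= x^1 \cos(x^2) \sin(x^3) = r \cos(\theta) \sin(\phi), \\
\frac{\partial x'^3}{\partial x^1} &= \cos(x^2) = \cos(\theta), \\
\frac{\partial x'^3}{\partial x^2} &= -x^1 \sin(x^2) = -r \sin(\theta),
\end{aligned}
$$

and

$$
\begin{aligned}
\frac{\partial x'^1}{\partial x^3} &= -x^1 \sin(x^2) \sin(x^3) = -r \sin(\theta) \sin(\phi), \\
\frac{\partial x'^2}{\partial x^3} &= x^1 \sin(x^2) \cos(x^3) = r \sin(\theta) \cos(\phi), \\
\frac{\partial x'^3}{\partial x^3} &= 0.
\end{aligned}
$$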


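Inserting these values into the expression for $g_{ij}$ (Eq. 5.15) gives the
diagonal terms: 3

$$
\begin{aligned}
g_{11} &= \frac{\partial x'^1}{\partial x^1} \frac{\partial x'^1}{\partial x^1} + \frac{\partial x'^2}{\partial x^1} \frac{\partial x'^2}{\partial x^1} + \frac{\partial x'^3}{\partial x^1} \frac{\partial x'^3}{\partial x^1} = 1, \\
g_{22} &= \frac{\partial x'^1}{\partial x^2} \frac{\partial x'^1}{\partial x^2} + \frac{\partial x'^2}{\partial x^2} \frac{\partial x'^2}{\partial x^2} + \frac{\partial x'^3}{\partial x^2} \frac{\partial x'^3}{\partial x^2} = r^2, \\
g_{33} &= \frac{\partial x'^1}{\partial x^3} \frac{\partial x'^1}{\partial x^3} + \frac{\partial x'^2}{\partial x^3} \frac{\partial x'^2}{\partial x^3} + \frac{\partial x'^3}{\partial x^3} \frac{\partial x'^3}{\partial x^3} = r^2 \sin^2(\theta).
\end{aligned}
$$

3 If you don’t see how to get these results, you can find more detail in the problems at the end of
this chapter and in the on-line solutions.

The off-diagonal terms are

$$
\begin{aligned}
g_{12} &= \frac{\partial x'^1}{\partial x^1} \frac{\partial x'^1}{\partial x^2} + \frac{\partial x'^2}{\partial x^1} \frac{\partial x'^2}{\partial x^2} + \frac{\partial x'^3}{\partial x^1} \frac{\partial x'^3}{\partial x^2} = 0, \\
g_{13} &= \frac{\partial x'^1}{\partial x^1} \frac{\partial x'^1}{\partial x^3} + \frac{\partial x'^2}{\partial x^1} \frac{\partial x'^2}{\partial x^3} + \frac{\partial x'^3}{\partial x^1} \frac{\partial x'^3}{\partial x^3} = 0, \\
g_{23} &= \frac{\partial x'^1}{\partial x^2} \frac{\partial x'^1}{\partial x^3} + \frac{\partial x'^2}{\partial x^2} \frac{\partial x'^2}{\partial x^3} + \frac{\partial x'^3}{\partial x^2} \frac{\partial x'^3}{\partial x^3} = 0.
\end{aligned}
$$

Thus the metric tensor for spherical polar coordinates is

$$
\begin{pmatrix}
g_{11} & g_{12} & g_{13} \\
g_{21} & g_{22} & g_{23} \\
g_{31} & g_{32} & g_{33}
\end{pmatrix}
=
\begin{pmatrix}
1 & 0 & 0 \\
0 & r^2 & 0 \\
0 & 0 & r^2 \sin^2(\theta)
\end{pmatrix}.
$$

A careful look at the metric tensor can tell you something about the coordinate
system you’re dealing with. For example, the fact that all off-diagonal elements
are zero in this case tells you that spherical polar coordinate axes, while curved,
are orthogonal (that is, the lines of increasing $r$, $\theta$, and $\phi$ intersect at right
angles). Furthermore, by inserting these values into Eq. 5.13, you’ll have

$$ ds^2 = dr^2 + r^2 d\theta^2 + r^2 \sin^2\theta \, d\phi^2. \tag{5.18} $$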


This expression makes it clear that the elements of the metric tensor tell you
how to turn an incremental change in $r$, $\theta$, or $\phi$ into a change in distance. For
example, the factor of one in front of the $dr^2$ term means that a change in r
is already a distance. But a change in zenith angle ($\theta$) must be multiplied by a
factor of r to turn it into a distance. And the distance corresponding to a change
in the azimuthal angle $\phi$ depends on both the zenith angle (hence the $\sin(\theta)$
term in $g_{33}$) and the distance from the origin (hence the r term in $g_{33}$).
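
One way to convince yourself that the spherical metric really does turn coordinate changes into distance is to compare it with a direct Cartesian calculation. The Python sketch below is an illustration under assumed values: the starting point and the small coordinate increments are arbitrary, and the function names are hypothetical. Because the increments are finite rather than infinitesimal, the two results should agree closely but not exactly.

```python
# Illustrative sketch: distance from the spherical metric versus the Cartesian
# distance between two nearby points (assumed point and increments).
import math

def to_cartesian(r, theta, phi):
    """Spherical polar (r, theta, phi) to Cartesian (x, y, z)."""
    return (r * math.sin(theta) * math.cos(phi),
            r * math.sin(theta) * math.sin(phi),
            r * math.cos(theta))

def ds2_from_metric(r, theta, dr, dtheta, dphi):
    """ds^2 = dr^2 + r^2 dtheta^2 + r^2 sin^2(theta) dphi^2."""
    return dr**2 + r**2 * dtheta**2 + (r * math.sin(theta))**2 * dphi**2

r, theta, phi = 2.0, 0.8, 0.3
dr, dtheta, dphi = 1e-4, 2e-4, -1e-4     # small but finite coordinate changes

p0 = to_cartesian(r, theta, phi)
p1 = to_cartesian(r + dr, theta + dtheta, phi + dphi)

ds2_cartesian = sum((b - a)**2 for a, b in zip(p0, p1))
ds2_metric = ds2_from_metric(r, theta, dr, dtheta, dphi)
relative_error = abs(ds2_cartesian - ds2_metric) / ds2_metric
```

The remaining discrepancy comes from using finite steps with a metric evaluated at the starting point; it shrinks as the increments are made smaller.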

Other coordinate systems require other factors to convert each change in a 
coordinate value to a distance, and those factors always appear in the metric 
tensor for that system. For orthogonal coordinate systems, the square roots of 
the diagonal elements of the metric tensor ($\sqrt{g_{11}}$, $\sqrt{g_{22}}$, and $\sqrt{g_{33}}$) are called
the “scale factors” ($h_1$, $h_2$, and $h_3$) of the coordinate system. Thus the scale
factors for spherical polar coordinates are $h_1 = \sqrt{g_{11}} = 1$, $h_2 = \sqrt{g_{22}} = r$,
and $h_3 = \sqrt{g_{33}} = r \sin\theta$.

Once you’re familiar with the metric tensor and scale factors, you can easily 
find the differential operators gradient, divergence, curl, and Laplacian in any 
orthogonal coordinate system (curvilinear or rectangular). For example, the 
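gradient is given by

$$ \nabla\phi = \frac{1}{h_1} \frac{\partial\phi}{\partial x^1} \hat{e}_1 + \frac{1}{h_2} \frac{\partial\phi}{\partial x^2} \hat{e}_2 + \frac{1}{h_3} \frac{\partial\phi}{\partial x^3} \hat{e}_3, $$

and the divergence may be written as

$$ \nabla \circ \vec{A} = \frac{1}{h_1 h_2 h_3} \left[ \frac{\partial}{\partial x^1} (h_2 h_3 A_1) + \frac{\partial}{\partial x^2} (h_1 h_3 A_2) + \frac{\partial}{\partial x^3} (h_1 h_2 A_3) \right]. $$

The curl is given by

$$
\nabla \times \vec{A} = \frac{1}{h_1 h_2 h_3}
\begin{vmatrix}
h_1 \hat{e}_1 & h_2 \hat{e}_2 & h_3 \hat{e}_3 \\
\dfrac{\partial}{\partial x^1} & \dfrac{\partial}{\partial x^2} & \dfrac{\partial}{\partial x^3} \\
h_1 A_1 & h_2 A_2 & h_3 A_3
\end{vmatrix},
$$

which expands to

$$
\begin{aligned}
\nabla \times \vec{A} = \frac{1}{h_1 h_2 h_3} \Bigg[ & \left( \frac{\partial (h_3 A_3)}{\partial x^2} - \frac{\partial (h_2 A_2)}{\partial x^3} \right) h_1 \hat{e}_1
+ \left( \frac{\partial (h_1 A_1)}{\partial x^3} - \frac{\partial (h_3 A_3)}{\partial x^1} \right) h_2 \hat{e}_2 \\
&+ \left( \frac{\partial (h_2 A_2)}{\partial x^1} - \frac{\partial (h_1 A_1)}{\partial x^2} \right) h_3 \hat{e}_3 \Bigg].
\end{aligned}
$$

The Laplacian can be found as

$$ \nabla^2 \phi = \frac{1}{h_1 h_2 h_3} \left[ \frac{\partial}{\partial x^1} \left( \frac{h_2 h_3}{h_1} \frac{\partial\phi}{\partial x^1} \right) + \frac{\partial}{\partial x^2} \left( \frac{h_1 h_3}{h_2} \frac{\partial\phi}{\partial x^2} \right) + \frac{\partial}{\partial x^3} \left( \frac{h_1 h_2}{h_3} \frac{\partial\phi}{\partial x^3} \right) \right]. $$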
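
To see the scale factors doing their job in one of these operators, here is a rough numerical sketch of the Laplacian in spherical polar coordinates. It is not a general-purpose implementation: it assumes the spherical scale factors $h_1 = 1$, $h_2 = r$, $h_3 = r\sin\theta$, approximates every derivative with a central finite difference, and uses hypothetical function names and an arbitrary test point. The test functions are $r^2$ (whose Laplacian is 6 in Cartesian coordinates) and $z = r\cos\theta$ (whose Laplacian is 0).

```python
# Illustrative sketch: Laplacian from scale factors in spherical polar coordinates,
# with derivatives approximated by central differences (assumed step size).
import math

def scale_factors(q):
    """Scale factors (h1, h2, h3) at the point q = (r, theta, phi)."""
    r, theta, _phi = q
    return (1.0, r, r * math.sin(theta))

def partial(func, q, i, step=1e-3):
    """Central-difference estimate of the derivative of func with respect to q[i]."""
    up, down = list(q), list(q)
    up[i] += step
    down[i] -= step
    return (func(up) - func(down)) / (2.0 * step)

def laplacian(f, q):
    """(1/(h1 h2 h3)) * sum over i of d/dx^i [ (h1 h2 h3 / h_i^2) * df/dx^i ]."""
    def flux(i):
        def inner(p):
            h = scale_factors(p)
            return (h[0] * h[1] * h[2] / h[i]**2) * partial(f, p, i)
        return inner
    h = scale_factors(q)
    total = sum(partial(flux(i), q, i) for i in range(3))
    return total / (h[0] * h[1] * h[2])

point = (2.0, 0.8, 0.3)                                            # (r, theta, phi)
lap_r_squared = laplacian(lambda q: q[0]**2, point)                # expect about 6
lap_z = laplacian(lambda q: q[0] * math.cos(q[1]), point)          # expect about 0
```

The factor `h1*h2*h3 / h_i**2` is just another way of writing the ratios $h_2 h_3/h_1$, $h_1 h_3/h_2$, and $h_1 h_2/h_3$ that appear in the Laplacian expression.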


If you’d like to see some examples of how these expressions can be used, check
out the problems at the end of this chapter and the on-line solutions. 4

4 You can find the derivation of these extremely handy equations in Boas’ Mathematical
Methods in the Physical Sciences, John Wiley and Sons, 2006.


5.6 Index raising and lowering

One of the many useful functions of the metric tensor is to convert between the
covariant and contravariant components of other tensors. Imagine that you’re
given the contravariant components and original basis vectors of a tensor and
you wish to determine the covariant components. One approach is to use the
techniques described in Chapter 4 (finding the dual basis vectors, performing
parallel and perpendicular projections, and the like), but with the metric tensor,
you have another option. You can use relations such as

$$ g_{ij} A^j = A_i \tag{5.19} $$

to convert a contravariant index to a covariant one (thus “lowering” an index).
Furthermore, if you wish to convert a covariant index to a contravariant index,
you can use the inverse of $g_{ij}$ (which is just $g^{ij}$) to perform operations like
this:

$$ g^{ij} B_i = B^j. \tag{5.20} $$
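
Lowering and raising an index amounts to a single matrix-style multiplication, as the following Python sketch suggests. The metric used here is an assumption made for illustration: it is the $g_{ij}$ of a 2-D basis with $\vec{e}_1 = (1, 0)$ and $\vec{e}_2 = (1, 1)$, and the component values and helper names are likewise hypothetical.

```python
# Illustrative sketch: lowering and raising a vector index with an assumed 2-D metric.

def inverse_2x2(m):
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def apply_metric(g, comps):
    """Return the list whose i-th entry is the sum over j of g[i][j] * comps[j]."""
    n = len(comps)
    return [sum(g[i][j] * comps[j] for j in range(n)) for i in range(n)]

g_lower = [[1.0, 1.0], [1.0, 2.0]]   # g_ij for basis e_1 = (1, 0), e_2 = (1, 1)
g_upper = inverse_2x2(g_lower)       # g^ij is the matrix inverse of g_ij

A_upper = [-1.0, 4.0]                          # contravariant components A^j
A_lower = apply_metric(g_lower, A_upper)       # lowering: A_i = g_ij A^j
A_round_trip = apply_metric(g_upper, A_lower)  # raising brings back A^j
```

Lowering the index gives the covariant components (3, 7), and raising it again with the inverse metric returns the original contravariant components (−1, 4).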


And this same process works for higher-order tensors:

$$
\begin{aligned}
g^{ij} A_{ik} &= A^j{}_k, \\
C_{jk} &= g_{js} C^s{}_k, \\
T^{ijk} &= g^{il} T_l{}^{jk}.
\end{aligned}
\tag{5.21}
$$


In many applications, it’s important to know how a vector field changes as
you move from one location to another. For vectors expressed using Cartesian
coordinates, taking the derivative of a vector is quite straightforward: you simply
take the derivative of each of the vector’s components. You can do that
because the Cartesian basis vectors ($\hat{\imath}$, $\hat{\jmath}$, and $\hat{k}$) are everywhere constant in
both magnitude and direction. That means you don’t need to worry about the
derivatives of the basis vectors. But as you’ve seen for spherical polar coordinates,
the basis vectors ($\hat{r}$, $\hat{\theta}$, and $\hat{\phi}$) point in different directions as you move
around the space, which means that when you take a spatial derivative of a
vector expressed in these coordinates, you must also consider the derivatives
of the basis vectors.

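Thus if you have a vector $\vec{A}$ expressed in general coordinates $x^1$, $x^2$, $x^3$ with
covariant basis vectors $\vec{e}_1$, $\vec{e}_2$, and $\vec{e}_3$ as

$$\vec{A} = A^1\vec{e}_1 + A^2\vec{e}_2 + A^3\vec{e}_3,$$

the derivative of $\vec{A}$ with respect to coordinate $x^1$ is

$$\frac{\partial\vec{A}}{\partial x^1} = \frac{\partial(A^1\vec{e}_1 + A^2\vec{e}_2 + A^3\vec{e}_3)}{\partial x^1} = \frac{\partial(A^i\vec{e}_i)}{\partial x^1} = \frac{\partial A^i}{\partial x^1}\vec{e}_i + A^i\frac{\partial\vec{e}_i}{\partial x^1}.$$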

It’s the second term in this equation that complicates the process of taking a
derivative in coordinate systems in which the magnitude and/or direction of
the basis vectors change as you move around the space. And as you might
expect, similar terms appear when you take the derivatives of $\vec{A}$ with respect
to the other coordinates. So if you want to evaluate the changes in vector fields
expressed in non-orthonormal coordinates, you have to account for possible
changes in the basis vectors. Properly accounting for those changes means that
the result of the differentiation process will retain the tensor characteristics of
the original object.

Fortunately, there’s a way to account for any change in the basis vectors and 
to ensure that the derivative of a tensor is another tensor. That process, called 
the “covariant derivative,” is described in the next section of this chapter. But 
the process of covariant differentiation will make a lot more sense to you if 
you’ve first learned the meaning of the Christoffel symbols described in this 
section. 

To understand Christoffel symbols, you should begin by realizing that the
derivative of a basis vector will be another vector. Like any vector, that vector
can be described as the weighted combination of the basis vectors at the point
under consideration. Each Christoffel symbol, written as an uppercase Greek
gamma ($\Gamma$), simply represents the weighting coefficient for one of the basis
vectors. Hence the defining relationship for Christoffel symbols⁵ is

$$\Gamma^k_{ij}\,\vec{e}_k = \frac{\partial\vec{e}_i}{\partial x^j}, \tag{5.22}$$


in which the index i specifies the basis vector for which the derivative is being 
taken, the index j denotes the coordinate being varied to induce this change 
in the ith basis vector, and the index k identifies the direction in which this 
component of the derivative points, as shown in Figure 5.1. 


Figure 5.1 Explanation of Christoffel symbol indices. In $\Gamma^k_{ij}$, the symbol
itself gives you the magnitude of one component of the derivative vector; the
lower index $i$ tells you which basis vector’s change is being considered; the
lower index $j$ tells you which coordinate is being varied to cause a change in
the basis vector; and the upper index $k$ tells you which basis vector points in
the direction of this component of the derivative vector.


5 The Christoffel symbols written as $\Gamma^k_{ij}$ are Christoffel symbols of the second kind; another
form of Christoffel symbol (the “first kind”) is described in most General Relativity texts.

Figure 5.2 Example of Christoffel symbol indices (each annotated component is
caused by a change in $\theta$).

Hence if you find two Christoffel symbols such as $\Gamma^r_{r\theta} = 0$ and $\Gamma^\theta_{r\theta} = \frac{1}{r}$,
you know that

$$\frac{\partial\vec{e}_r}{\partial\theta} = 0\,\vec{e}_r + \frac{1}{r}\,\vec{e}_\theta,$$

which is further explained in Figure 5.2.
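The relation above can be checked numerically. The sketch below assumes plane polar coordinates with the covariant (non-normalized) basis vectors $\vec{e}_r = (\cos\theta, \sin\theta)$ and $\vec{e}_\theta = (-r\sin\theta, r\cos\theta)$ written in Cartesian components; the function names, the sample point, and the finite-difference step are illustrative choices rather than anything prescribed by the text.

```python
import math

def e_r(r, theta):
    """Cartesian components of the covariant basis vector e_r."""
    return (math.cos(theta), math.sin(theta))

def e_theta(r, theta):
    """Cartesian components of e_theta (not normalized; its length is r)."""
    return (-r * math.sin(theta), r * math.cos(theta))

def d_dtheta(basis, r, theta, h=1e-6):
    """Central-difference estimate of the theta-derivative of a basis vector."""
    plus = basis(r, theta + h)
    minus = basis(r, theta - h)
    return tuple((p - m) / (2 * h) for p, m in zip(plus, minus))

r, theta = 2.0, 0.7
derivative = d_dtheta(e_r, r, theta)

# Weighted combination 0*e_r + (1/r)*e_theta predicted by the two symbols.
predicted = tuple(0.0 * a + (1.0 / r) * b
                  for a, b in zip(e_r(r, theta), e_theta(r, theta)))

print(derivative)
print(predicted)
```

The two printed vectors agree to within the finite-difference error, which is the content of the statement that the only non-zero weighting coefficient here is $1/r$ on $\vec{e}_\theta$.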

As this example illustrates, Christoffel symbols are really quite simple to 
understand once you know the code of their indices. Best of all, the values of 
these useful symbols are easy to determine if you know the elements of the 
metric tensor for the coordinate system in which you’re working. It will take 
a bit of algebra to get to the relationship between Christoffel symbols and the 
metric tensor, but the result makes the trip worthwhile. 

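A good way to start is to form the dot product of the dual basis vector $\vec{e}^{\,l}$ with
both sides of Eq. 5.22:

$$\vec{e}^{\,l}\circ\Gamma^k_{ij}\,\vec{e}_k = \vec{e}^{\,l}\circ\frac{\partial\vec{e}_i}{\partial x^j}.$$

Remembering that $\vec{e}_k\circ\vec{e}^{\,l} = \delta^l_k$, this becomes

$$\Gamma^l_{ij} = \vec{e}^{\,l}\circ\frac{\partial\vec{e}_i}{\partial x^j}.$$

Since the term $\frac{\partial\vec{e}_i}{\partial x^j}$ is the same as $\frac{\partial\vec{e}_j}{\partial x^i}$, this may be written as

$$\Gamma^l_{ij} = \frac{1}{2}\,\vec{e}^{\,l}\circ\frac{\partial\vec{e}_i}{\partial x^j} + \frac{1}{2}\,\vec{e}^{\,l}\circ\frac{\partial\vec{e}_j}{\partial x^i},$$

which seems rather pointless until you add nothing to it. Nothing, that is, in the
following form:

$$\begin{aligned}
\Gamma^l_{ij} ={}& \frac{1}{2}\,\vec{e}^{\,l}\circ\frac{\partial\vec{e}_i}{\partial x^j} + \left(\frac{1}{2}g^{kl}\frac{\partial\vec{e}_k}{\partial x^j}\circ\vec{e}_i - \frac{1}{2}g^{kl}\frac{\partial\vec{e}_j}{\partial x^k}\circ\vec{e}_i\right)\\
&+ \frac{1}{2}\,\vec{e}^{\,l}\circ\frac{\partial\vec{e}_j}{\partial x^i} + \left(\frac{1}{2}g^{kl}\frac{\partial\vec{e}_k}{\partial x^i}\circ\vec{e}_j - \frac{1}{2}g^{kl}\frac{\partial\vec{e}_i}{\partial x^k}\circ\vec{e}_j\right).
\end{aligned}$$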


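Note that the terms in parentheses on each line add to zero, so you haven’t
changed the quantity on the right side of the equation by adding these terms.
It may look like things are getting worse, but the situation will become more
clear once you’ve accomplished a few more bits of manipulation. The first bit
is to realize that $\vec{e}^{\,l} = g^{kl}\vec{e}_k$, so the Christoffel symbol becomes

$$\begin{aligned}
\Gamma^l_{ij} ={}& \frac{1}{2}g^{kl}\,\vec{e}_k\circ\frac{\partial\vec{e}_i}{\partial x^j} + \frac{1}{2}g^{kl}\frac{\partial\vec{e}_k}{\partial x^j}\circ\vec{e}_i - \frac{1}{2}g^{kl}\frac{\partial\vec{e}_j}{\partial x^k}\circ\vec{e}_i\\
&+ \frac{1}{2}g^{kl}\,\vec{e}_k\circ\frac{\partial\vec{e}_j}{\partial x^i} + \frac{1}{2}g^{kl}\frac{\partial\vec{e}_k}{\partial x^i}\circ\vec{e}_j - \frac{1}{2}g^{kl}\frac{\partial\vec{e}_i}{\partial x^k}\circ\vec{e}_j.
\end{aligned}$$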


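Now it’s just a matter of pulling out the common factor of $\frac{1}{2}g^{kl}$ and grouping the
terms by their sign:

$$\Gamma^l_{ij} = \frac{1}{2}g^{kl}\left[\vec{e}_k\circ\frac{\partial\vec{e}_i}{\partial x^j} + \frac{\partial\vec{e}_k}{\partial x^j}\circ\vec{e}_i + \vec{e}_k\circ\frac{\partial\vec{e}_j}{\partial x^i} + \frac{\partial\vec{e}_k}{\partial x^i}\circ\vec{e}_j - \frac{\partial\vec{e}_j}{\partial x^k}\circ\vec{e}_i - \frac{\partial\vec{e}_i}{\partial x^k}\circ\vec{e}_j\right],$$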

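which may be further simplified if you recognize that

$$\begin{aligned}
\vec{e}_k\circ\frac{\partial\vec{e}_i}{\partial x^j} + \frac{\partial\vec{e}_k}{\partial x^j}\circ\vec{e}_i &= \frac{\partial(\vec{e}_k\circ\vec{e}_i)}{\partial x^j},\\
\vec{e}_k\circ\frac{\partial\vec{e}_j}{\partial x^i} + \frac{\partial\vec{e}_k}{\partial x^i}\circ\vec{e}_j &= \frac{\partial(\vec{e}_j\circ\vec{e}_k)}{\partial x^i},\\
\frac{\partial\vec{e}_j}{\partial x^k}\circ\vec{e}_i + \frac{\partial\vec{e}_i}{\partial x^k}\circ\vec{e}_j &= \frac{\partial(\vec{e}_i\circ\vec{e}_j)}{\partial x^k}.
\end{aligned}$$

So

$$\Gamma^l_{ij} = \frac{1}{2}g^{kl}\left[\frac{\partial(\vec{e}_k\circ\vec{e}_i)}{\partial x^j} + \frac{\partial(\vec{e}_j\circ\vec{e}_k)}{\partial x^i} - \frac{\partial(\vec{e}_i\circ\vec{e}_j)}{\partial x^k}\right].$$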

But you know from the definition of the elements of the metric tensor that
$\vec{e}_i\circ\vec{e}_k = g_{ik}$ and that $\vec{e}_i\circ\vec{e}_j = g_{ij}$, which means you can write

$$\Gamma^l_{ij} = \frac{1}{2}g^{kl}\left[\frac{\partial g_{ik}}{\partial x^j} + \frac{\partial g_{jk}}{\partial x^i} - \frac{\partial g_{ij}}{\partial x^k}\right]. \tag{5.23}$$

With this expression, finding the Christoffel symbols for any coordinate system
for which you know the metric tensor is quite straightforward. And why
is that worth doing? Simply because using the Christoffel symbols, you can 
take a derivative of vectors and tensors that accounts for changes in the basis 
vectors as well as changes in the components. This preserves the most impor- 
tant property of a tensor: invariance across coordinate systems. Such covariant 
derivatives are the subject of the next section, but before getting to that, you 
might want to consider an example of the Christoffel symbols for a familiar 
coordinate system. 

Consider the cylindrical coordinates ($r$, $\phi$, and $z$) described in Section 1.5.
In this system, the square of the differential length element is related to the
coordinate differentials by $ds^2 = dr^2 + r^2 d\phi^2 + dz^2$. Hence the covariant
metric tensor may be represented by

$$\begin{bmatrix} g_{11} & g_{12} & g_{13}\\ g_{21} & g_{22} & g_{23}\\ g_{31} & g_{32} & g_{33} \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0\\ 0 & r^2 & 0\\ 0 & 0 & 1 \end{bmatrix},$$

which suggests that most of the Christoffel symbols will be zero in this case.
You can verify that by taking the derivatives indicated in Eq. 5.23, beginning
with $l = 1$, $i = 1$, and $j = 1$ (and don’t forget that the summation convention
means that you must sum over $k$):


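$$\begin{aligned}
\Gamma^1_{11} ={}& \frac{1}{2}g^{11}\left[\frac{\partial g_{11}}{\partial x^1} + \frac{\partial g_{11}}{\partial x^1} - \frac{\partial g_{11}}{\partial x^1}\right]\\
&+ \frac{1}{2}g^{21}\left[\frac{\partial g_{12}}{\partial x^1} + \frac{\partial g_{12}}{\partial x^1} - \frac{\partial g_{11}}{\partial x^2}\right]\\
&+ \frac{1}{2}g^{31}\left[\frac{\partial g_{13}}{\partial x^1} + \frac{\partial g_{13}}{\partial x^1} - \frac{\partial g_{11}}{\partial x^3}\right],
\end{aligned}$$

and then using the relations $x^1 = r$, $x^2 = \phi$, and $x^3 = z$:

$$\begin{aligned}
\Gamma^1_{11} ={}& \frac{1}{2}(1)\left[\frac{\partial(1)}{\partial r} + \frac{\partial(1)}{\partial r} - \frac{\partial(1)}{\partial r}\right]\\
&+ \frac{1}{2}(0)\left[\frac{\partial(0)}{\partial r} + \frac{\partial(0)}{\partial r} - \frac{\partial(1)}{\partial\phi}\right]\\
&+ \frac{1}{2}(0)\left[\frac{\partial(0)}{\partial r} + \frac{\partial(0)}{\partial r} - \frac{\partial(1)}{\partial z}\right]\\
={}& 0.
\end{aligned}$$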


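OK, that one was pretty boring, as are most of the others in this case. But have
a go at the Christoffel symbol for $l = 1$, $i = 2$, and $j = 2$, which is:

$$\begin{aligned}
\Gamma^1_{22} ={}& \frac{1}{2}g^{11}\left[\frac{\partial g_{21}}{\partial x^2} + \frac{\partial g_{21}}{\partial x^2} - \frac{\partial g_{22}}{\partial x^1}\right]\\
&+ \frac{1}{2}g^{21}\left[\frac{\partial g_{22}}{\partial x^2} + \frac{\partial g_{22}}{\partial x^2} - \frac{\partial g_{22}}{\partial x^2}\right]\\
&+ \frac{1}{2}g^{31}\left[\frac{\partial g_{23}}{\partial x^2} + \frac{\partial g_{23}}{\partial x^2} - \frac{\partial g_{22}}{\partial x^3}\right]\\
={}& \frac{1}{2}(1)\left[\frac{\partial(0)}{\partial\phi} + \frac{\partial(0)}{\partial\phi} - \frac{\partial r^2}{\partial r}\right]\\
&+ \frac{1}{2}(0)\left[\frac{\partial r^2}{\partial\phi} + \frac{\partial r^2}{\partial\phi} - \frac{\partial r^2}{\partial\phi}\right]\\
&+ \frac{1}{2}(0)\left[\frac{\partial(0)}{\partial\phi} + \frac{\partial(0)}{\partial\phi} - \frac{\partial r^2}{\partial z}\right],
\end{aligned}$$

$$\Gamma^1_{22} = \frac{1}{2}(1)[0 + 0 - 2r] + 0 + 0 = -r.$$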


Now you’re getting somewhere. And exactly where is that? Just remember
the meaning of a Christoffel symbol, and you’ll see that this result means that
the change in the covariant $\phi$ basis vector as you move in the $\phi$ direction has
a component in the $-r$ direction that increases directly with distance from the
origin.

A similar analysis shows that $\Gamma^2_{12} = \Gamma^2_{21} = 1/r$, which are the only other
non-zero Christoffel symbols for the cylindrical coordinate system.⁶ If you
don’t see how to get that result, take a look at the problems at the end of this
chapter and the on-line solutions.
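If you would like to check these values without doing the algebra by hand, Eq. 5.23 can be evaluated numerically. The sketch below is one possible approach under stated assumptions: it approximates the metric derivatives with central differences, inverts the metric only in the diagonal case, and uses zero-based list indices (so `gamma[0][1][1]` stands for $\Gamma^1_{22}$). All function names are illustrative.

```python
def cylindrical_metric(x):
    """Covariant metric g_ij at the point x = (r, phi, z)."""
    r = x[0]
    return [[1.0, 0.0, 0.0],
            [0.0, r * r, 0.0],
            [0.0, 0.0, 1.0]]

def inverse_diagonal(g):
    """Inverse metric g^ij; valid only for a diagonal metric."""
    n = len(g)
    return [[(1.0 / g[i][i]) if i == j else 0.0 for j in range(n)]
            for i in range(n)]

def metric_derivative(metric, x, k, h=1e-5):
    """Central-difference estimate of the matrix d g_ij / d x^k."""
    x_plus, x_minus = list(x), list(x)
    x_plus[k] += h
    x_minus[k] -= h
    g_plus, g_minus = metric(x_plus), metric(x_minus)
    n = len(x)
    return [[(g_plus[i][j] - g_minus[i][j]) / (2 * h) for j in range(n)]
            for i in range(n)]

def christoffel(metric, x):
    """gamma[l][i][j] = 1/2 g^{kl} (d_j g_ik + d_i g_jk - d_k g_ij)."""
    n = len(x)
    g_inv = inverse_diagonal(metric(x))
    dg = [metric_derivative(metric, x, k) for k in range(n)]  # dg[k][i][j]
    gamma = [[[0.0] * n for _ in range(n)] for _ in range(n)]
    for l in range(n):
        for i in range(n):
            for j in range(n):
                gamma[l][i][j] = 0.5 * sum(
                    g_inv[k][l] * (dg[j][i][k] + dg[i][j][k] - dg[k][i][j])
                    for k in range(n))
    return gamma

r = 2.0
gamma = christoffel(cylindrical_metric, [r, 0.3, 1.0])
print(gamma[0][1][1])   # Gamma^1_22, expected to be close to -r
print(gamma[1][0][1])   # Gamma^2_12, expected to be close to 1/r
print(gamma[1][1][0])   # Gamma^2_21, expected to be close to 1/r
```

Every other entry of `gamma` should come out as zero to within rounding error, in line with the statement that only three symbols are non-zero for this coordinate system.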


5.8 Covariant differentiation 

With Christoffel symbols in hand, you have a way of differentiating a vector or 
higher-order tensor that includes the effect of changes (if any) in the magnitude 
and direction of the basis vectors used to expand that vector or tensor. This type 
of derivative is called the “covariant” derivative, and it finds application not 
only in the Euclidean space in which many engineering and physics problems 
are worked, but also in the curved Riemannian space of General Relativity.

In Euclidean space, two vectors at different locations may be compared and
combined by dragging one of the vectors to the location of the other without
changing its magnitude or its direction. If the vector is expanded using Cartesian
coordinates, such “parallel transport” is accomplished simply by keeping
each of its components the same (because the Cartesian basis vectors have the
same magnitude and direction everywhere). But if the vector is expressed in
non-Cartesian coordinates, the length and direction of the basis vectors may be
different at the two locations. In such cases, the covariant derivative provides
a means of parallel-transporting one of the vectors to the location of the other.

6 Note that the symmetry of the metric tensor means that Christoffel symbols of this type are
symmetric in the two lower indices.

The situation is more complicated for curved spaces. You can find the details 
of the use of the covariant derivative in curved spaces in Chapter 6, but for 
now you can understand the role of the covariant derivative by considering a 
two-dimensional spherical surface embedded in a three-dimensional Euclidean 
space. Imagine a series of tangent planes just touching the sphere at each loca- 
tion, and picture a vector lying in one of those tangent planes. If that vector is 
moved to a different location on the sphere while holding its direction constant 
(as viewed in the larger three-dimensional space), it will not lie in the tangent 
plane at the new location (you can think of the vector as “sticking out” of 
the two-dimensional space of the sphere). In such cases, the covariant deriva- 
tive serves to project the derivative of the vector into the tangent space of the 
sphere. 

You should also note that the covariant differentiation process produces 
a result that retains the properties of a tensor, which means that the result 
transforms between coordinate systems according to the rules of tensor 
transformation. 

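To understand how the process of covariant differentiation works, consider
the vector $\vec{A} = A^1\vec{e}_1 + A^2\vec{e}_2 + A^3\vec{e}_3$ and its derivatives

$$\frac{\partial\vec{A}}{\partial x^j} = \frac{\partial(A^1\vec{e}_1 + A^2\vec{e}_2 + A^3\vec{e}_3)}{\partial x^j} = \frac{\partial(A^i\vec{e}_i)}{\partial x^j} = \frac{\partial A^i}{\partial x^j}\vec{e}_i + A^i\frac{\partial\vec{e}_i}{\partial x^j}.$$

Now replace the partial derivative in the second term with the Christoffel-symbol
definition (Eq. 5.22):

$$\frac{\partial\vec{A}}{\partial x^j} = \frac{\partial A^i}{\partial x^j}\vec{e}_i + A^i\left(\Gamma^k_{ij}\,\vec{e}_k\right).$$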

Since the indices $i$ and $k$ in the second term are both dummy indices by the
summation rule, you can switch them and then extract the common factor that
is now the basis vector $\vec{e}_i$:

$$\frac{\partial\vec{A}}{\partial x^j} = \frac{\partial A^i}{\partial x^j}\vec{e}_i + A^k\left(\Gamma^i_{kj}\,\vec{e}_i\right) = \left(\frac{\partial A^i}{\partial x^j} + A^k\Gamma^i_{kj}\right)\vec{e}_i.$$




The covariant derivative is defined as the combination of the two terms inside 
the parentheses. Common notation for the covariant derivative is to use a semi- 
colon (;) in front of the index with respect to which the covariant derivative is 
being taken (j in this case). Thus you’re likely to see the components of the 
covariant derivative defined as 


$$A^i_{;j} \equiv \frac{\partial A^i}{\partial x^j} + A^k\Gamma^i_{kj}. \tag{5.24}$$


A similar analysis leads to the covariant derivative of a vector expanded 
using covariant coefficients: 


$$A_{i;j} \equiv \frac{\partial A_i}{\partial x^j} - A_k\Gamma^k_{ij}. \tag{5.25}$$

Note that the term involving Christoffel symbols is subtracted in this case. 

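To make the meaning of Eqs. 5.24 and 5.25 more explicit, consider the
covariant derivative of vector $\vec{A}$ with respect to $\phi$ in cylindrical coordinates
(so $x^1 = r$, $x^2 = \phi$, and $x^3 = z$). Setting $j = 2$ in Eq. 5.24 (since we’re
interested in the covariant derivative with respect to $\phi$),

$$\begin{aligned}
A^r_{;\phi} &= \frac{\partial A^r}{\partial\phi} + A^r\Gamma^r_{r\phi} + A^\phi\Gamma^r_{\phi\phi} + A^z\Gamma^r_{z\phi}\\
&= \frac{\partial A^r}{\partial\phi} + 0 + A^\phi(-r) + 0,
\end{aligned}$$

which says that a change in the $r$-component of vector $\vec{A}$ caused by a change
in $\phi$ is caused both by a change in $A^r$ with $\phi$ and by a change in the basis
vectors which causes a portion of $\vec{A}$ that was originally in the $\phi$-direction to
now point in the $-r$-direction. Likewise, for the change in $A^\phi$ as the value of
$\phi$ is changed,

$$\begin{aligned}
A^\phi_{;\phi} &= \frac{\partial A^\phi}{\partial\phi} + A^r\Gamma^\phi_{r\phi} + A^\phi\Gamma^\phi_{\phi\phi} + A^z\Gamma^\phi_{z\phi}\\
&= \frac{\partial A^\phi}{\partial\phi} + A^r\left(\frac{1}{r}\right) + 0 + 0.
\end{aligned}$$

Thus, since every Christoffel symbol with an upper $z$ index is zero (so that
$A^z_{;\phi} = \partial A^z/\partial\phi$),

$$\frac{\partial\vec{A}}{\partial\phi} = \left(\frac{\partial A^r}{\partial\phi} - rA^\phi\right)\vec{e}_r + \left(\frac{\partial A^\phi}{\partial\phi} + \frac{A^r}{r}\right)\vec{e}_\phi + \frac{\partial A^z}{\partial\phi}\,\vec{e}_z.$$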
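One way to build confidence in these expressions is to apply them to a vector field that you know does not change at all. The sketch below assumes the constant Cartesian field $\vec{A} = \hat{\imath}$, whose contravariant cylindrical components (with respect to the non-normalized basis vectors) are $A^r = \cos\phi$, $A^\phi = -\sin\phi/r$, and $A^z = 0$. The function names and the finite-difference step are illustrative assumptions.

```python
import math

def components(r, phi):
    """Contravariant cylindrical components (A^r, A^phi, A^z) of the
    constant Cartesian field A = x-hat."""
    return (math.cos(phi), -math.sin(phi) / r, 0.0)

def d_dphi(r, phi, h=1e-6):
    """Central-difference estimate of the phi-derivative of each component."""
    plus = components(r, phi + h)
    minus = components(r, phi - h)
    return tuple((p - m) / (2 * h) for p, m in zip(plus, minus))

def covariant_derivative_phi(r, phi):
    """Components A^i_{;phi} using the non-zero cylindrical symbols
    Gamma^r_{phi phi} = -r and Gamma^phi_{r phi} = 1/r."""
    A_r, A_phi, A_z = components(r, phi)
    dA_r, dA_phi, dA_z = d_dphi(r, phi)
    return (dA_r + A_phi * (-r),        # r-component
            dA_phi + A_r * (1.0 / r),   # phi-component
            dA_z)                       # z-component (no Christoffel term)

result = covariant_derivative_phi(1.5, 0.8)
print(result)
```

The partial derivatives of the components are not zero, yet all three covariant-derivative components vanish to within numerical error: the Christoffel-symbol terms cancel the apparent change that comes only from the rotating basis vectors.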


The process of covariant differentiation can also be applied to higher-order
tensors. As you might expect, this simply requires the addition of a
Christoffel-symbol term for each contravariant index, and the subtraction of
a Christoffel-symbol term for each covariant index. Hence

$$\begin{aligned}
A^{ij}{}_{;k} &= \frac{\partial A^{ij}}{\partial x^k} + A^{lj}\Gamma^i_{lk} + A^{il}\Gamma^j_{lk},\\
B_{ij;k} &= \frac{\partial B_{ij}}{\partial x^k} - B_{lj}\Gamma^l_{ik} - B_{il}\Gamma^l_{jk},\\
C^i{}_{j;k} &= \frac{\partial C^i{}_j}{\partial x^k} + C^l{}_j\Gamma^i_{lk} - C^i{}_l\Gamma^l_{jk}.
\end{aligned}$$
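As an illustration of the rule for two covariant indices, the sketch below evaluates $g_{ij;k}$ for the cylindrical metric using the second expression above together with the Christoffel symbols found earlier in this chapter. The function names are illustrative assumptions, the derivatives and symbols are entered analytically rather than derived by the code, and list indices are zero-based (0 for $r$, 1 for $\phi$, 2 for $z$).

```python
def metric(r):
    """Covariant cylindrical metric g_ij."""
    return [[1.0, 0.0, 0.0],
            [0.0, r * r, 0.0],
            [0.0, 0.0, 1.0]]

def metric_derivative(r, k):
    """Matrix d g_ij / d x^k; only d g_phiphi / d r = 2r is non-zero."""
    dg = [[0.0] * 3 for _ in range(3)]
    if k == 0:
        dg[1][1] = 2.0 * r
    return dg

def christoffel(r):
    """gamma[l][i][j] holds Gamma^l_ij for cylindrical coordinates."""
    gamma = [[[0.0] * 3 for _ in range(3)] for _ in range(3)]
    gamma[0][1][1] = -r          # Gamma^1_22
    gamma[1][0][1] = 1.0 / r     # Gamma^2_12
    gamma[1][1][0] = 1.0 / r     # Gamma^2_21
    return gamma

def covariant_derivative_of_metric(r):
    """result[i][j][k] = d_k g_ij - g_lj Gamma^l_ik - g_il Gamma^l_jk."""
    g = metric(r)
    gamma = christoffel(r)
    result = [[[0.0] * 3 for _ in range(3)] for _ in range(3)]
    for i in range(3):
        for j in range(3):
            for k in range(3):
                partial = metric_derivative(r, k)[i][j]
                term_i = sum(g[l][j] * gamma[l][i][k] for l in range(3))
                term_j = sum(g[i][l] * gamma[l][j][k] for l in range(3))
                result[i][j][k] = partial - term_i - term_j
    return result

values = covariant_derivative_of_metric(1.7)
largest = max(abs(values[i][j][k])
              for i in range(3) for j in range(3) for k in range(3))
print(largest)
```

For this metric every component comes out as zero to within rounding error, even though the ordinary partial derivative $\partial g_{\phi\phi}/\partial r = 2r$ is not zero.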


5.9 Vectors and one-forms 

If you look up the subject of tensors in recently published physics texts, espe- 
cially those dealing with General Relativity, you may be surprised to find little 
mention of contravariant and covariant components in favor of terms such as 
“covectors” and “one-forms.” Have you wasted your time struggling to under- 
stand complicated concepts and terminology that have now become obsolete? 
I obviously don’t think so, or I wouldn’t have devoted so many pages to the 
developments of the last two chapters. Instead, I believe there’s value in seeing 
the “traditional” presentation as well as the “modern” approach, because the 
differences arise from perspective rather than from the core concepts. But those 
different perspectives do lead to very different terminology, and the purpose of 
this section is to provide a short introduction to that terminology. 

The first thing to understand is that the traditional approach tends to treat 
contravariant and covariant components as representations of the same object, 
whereas in the modern approach objects are classified either as “vectors” or 
as “one-forms” (also called “covectors”). In the modern terminology, vectors 
transform as contravariant quantities, and one-forms transform as covariant 
quantities. Quantities with dimension of length in the numerator (such as 
velocity, with units that include “meters per”) fit naturally into the vector 
category; quantities with dimension of length in the denominator (such as the
gradient of a scalar field, with units that include “per meter”) fit naturally into the
one-form category.

In illustrations involving vectors and one-forms, vectors are represented as
arrows and one-forms are represented as small sections of surfaces, as shown
in Figure 5.3. As indicated in the figure, for vectors the angle of the arrow
shows direction and the length of the arrow shows the magnitude. For one-forms,
surfaces are aligned normal to the direction and the spacing between
surfaces is inversely proportional to the magnitude. This means that vectors
with greater magnitude are represented by longer arrows, while one-forms of
greater magnitude are represented by closer spacing.

Figure 5.3 Representation of vectors as arrows and one-forms as surfaces
(labeled examples include a one-form with small magnitude and a vector with
small magnitude).

As in the traditional approach, vectors (which utilize contravariant com- 
ponents) expand using original basis vectors, while one-forms (which utilize 
covariant components) expand using basis one-forms, which are equivalent to 
dual basis vectors in the traditional approach. That correspondence means that 
the product of a vector and a one-form is an invariant (a scalar), just as the 
multiplication of a contravariant and a covariant quantity produces a scalar 
without requiring the metric tensor. One very nice graphical interpretation 
of such products is that the resulting scalar is represented by the number of 
one-form surfaces through which the arrow of a vector passes. 

Authors using the modern approach often place strong emphasis on vectors 
and one-forms as operators (or rules), so you’re likely to encounter statements 
that vectors “take” one-forms and produce scalars, just as one-forms “take” 
vectors and produce scalars. Likewise, a higher-order tensor takes multiple 
vectors and/or one-forms and produces a scalar. From this perspective, the met- 
ric tensor is an operator that takes two vectors or two one-forms and produces 
their dot product, and the components of the metric tensor may be found by 
feeding it basis vectors or one-forms. 
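The invariance of the product of a vector and a one-form can be seen in a small numerical experiment. The sketch below assumes a change from two-dimensional Cartesian coordinates $(x, y)$ to plane polar coordinates $(r, \theta)$: vector components are transformed with the partial derivatives $\partial x'^i/\partial x^j$ and one-form components with $\partial x^j/\partial x'^i$. The function names and sample values are illustrative.

```python
import math

def to_polar_vector(A, r, theta):
    """Contravariant rule: A'^i = (dx'^i / dx^j) A^j."""
    Ax, Ay = A
    return (math.cos(theta) * Ax + math.sin(theta) * Ay,
            (-math.sin(theta) * Ax + math.cos(theta) * Ay) / r)

def to_polar_one_form(B, r, theta):
    """Covariant rule: B'_i = (dx^j / dx'^i) B_j."""
    Bx, By = B
    return (math.cos(theta) * Bx + math.sin(theta) * By,
            -r * math.sin(theta) * Bx + r * math.cos(theta) * By)

def contract(vector, one_form):
    """Scalar produced when a one-form 'takes' a vector: A^i B_i."""
    return sum(v * w for v, w in zip(vector, one_form))

r, theta = 2.5, 0.9
A = (3.0, -1.0)      # vector components in Cartesian coordinates
B = (0.5, 4.0)       # one-form components in Cartesian coordinates

scalar_cartesian = contract(A, B)
scalar_polar = contract(to_polar_vector(A, r, theta),
                        to_polar_one_form(B, r, theta))
print(scalar_cartesian, scalar_polar)
```

The individual components change considerably between the two coordinate systems, but the contraction does not, and no metric tensor is needed to form it.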


5.10 Chapter 5 problems 

5.1 Show that the process of subtracting one tensor from another results in a
quantity that is also a tensor.

5.2 Find the elements of the metric tensor for spherical coordinates by
forming the dot products of the relevant basis vectors.

5.3 Show how the derivatives given after Eq. 5.16 lead to the elements of the 
metric tensor for spherical polar coordinates (Eq. 5.17). 

5.4 Use the scale factors for spherical polar coordinates to verify the expres- 
sions given in Chapter 2 for the gradient, divergence, curl, and Laplacian 
in spherical coordinates. 

5.5 Show that for cylindrical coordinates $(r, \phi, z)$ the Christoffel symbols
$\Gamma^2_{12}$ and $\Gamma^2_{21}$ are equal to $1/r$.

5.6 Find $g^{ij}$, the inverse of the spherical metric tensor $g_{ij}$.

5.7 Use $g^{ij}$ to raise the indices of the vector $A_i = (1, r^2\sin\theta, \sin^2\theta)$.

5.8 On the two-dimensional surface of a sphere of radius $R$, the square of
the differential length element is given by $ds^2 = R^2 d\theta^2 + R^2\sin^2\theta\, d\phi^2$.
Find the metric tensor $g_{ij}$ and its inverse $g^{ij}$ for this case.

5.9 What are the Christoffel symbols for the 2-D spherical surface of 
Problem 5.8? 

5.10 Show that the covariant derivative of the metric tensor equals zero.


This chapter provides examples of how to apply the tensor concepts contained
in Chapters 4 and 5, just as Chapter 3 provided examples of how to apply
the vector concepts presented in Chapters 1 and 2. As in Chapter 3, the intent 
for this chapter is to include more detail about a small number of selected 
applications than can be included in the chapters in which tensor concepts are 
first presented. 

The examples in this chapter come from the fields of Mechanics, Elec- 
tromagnetics, and General Relativity. Of course, there’s no way to compre- 
hensively cover any significant portion of those fields in one chapter; these 
examples were chosen only to serve as representatives of the types of tensor 
application you’re likely to encounter in those fields. 


6.1 The inertia tensor 

A very useful way to think of mass is this: mass is the characteristic of matter 
that resists acceleration. This means that it takes a force to change the velocity 
of any object with mass. You may find it helpful to think of moment of inertia 
as the rotational analog of mass. That is, moment of inertia is the characteristic 
of matter that resists angular acceleration, so it takes a torque to change the 
angular velocity of an object. 

Many students find that rotational motion is easier to understand by keeping
the relationships between translational and rotational quantities in mind. So
where translational motion dealt with position ($x$), velocity ($v$), and acceleration
($a$), rotational motion has the analogous quantities of angle ($\theta$), angular
velocity ($\omega$), and angular acceleration ($\alpha$). There are rotational analogs for
many other quantities; the translational quantities of force ($F$), mass ($m$), and
momentum ($p$) have the rotational equivalents of torque ($\tau$), moment of inertia
($I$), and angular momentum ($L$).

As you may also recall, several of the equations relating various translational
quantities have direct parallels in rotational motion. So the rotational equivalent
of Newton’s Second Law ($\vec{F} = m\vec{a}$) is $\vec{\tau} = I\vec{\alpha}$.¹ And whereas translational
momentum is related to mass and velocity by $\vec{p} = m\vec{v}$, you probably learned
that angular momentum is related to moment of inertia and angular velocity by

$$L_z = I\omega.$$

When first presenting these relationships, most texts restrict the motion to
planar rotation of a single particle to keep things simple. So when you think of
the relationship between linear and angular velocity, you may think of something
like $v = \omega r$. And if $L_z = mvr$, then $L_z = mr^2\omega$. Taking $mr^2$ as the
moment of inertia ($I$) of a single particle, this becomes $L_z = I\omega$. But the $v$
and the $\omega$ in those equations can’t really be velocities, since they’re written
as scalars rather than vectors, and that $z$ subscript on the angular momentum
seems to be trying to tell you something.

It is. It’s telling you that you’re using an equation for one component of the 
angular momentum (the z-component in this case), and this pertains to a single 
particle moving about the origin in the xy plane. So these equations aren’t 
wrong, they just have limited application. Specifically, they apply to cases of 
planar motion about the z-axis. 

The more-general relationship between the vectors that represent velocity, 
angular velocity, and position is this: 

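$$\vec{v} = \vec{\omega}\times\vec{r}, \tag{6.1}$$

in which the cross represents the vector cross product described in Chapter 2.
And the equations relating angular momentum to linear momentum, linear
velocity, and mass are

$$\begin{aligned}
\vec{L} &= \vec{r}\times\vec{p}\\
&= \vec{r}\times(m\vec{v})\\
&= m\vec{r}\times\vec{v}.
\end{aligned} \tag{6.2}$$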

Before delving more deeply into these equations, you should consider the
implications of the (planar-motion) equation that says that the moment of
inertia of a single particle is $I_{\text{particle}} = mr^2$. One important idea in this
equation is that the moment of inertia of a particle depends not only on its
mass, but also on the location of that mass - specifically, the distance ($r$) of
the mass from the axis of rotation. Thus the moment of inertia of an extended
object made up of many particles must depend not only on the object’s mass,
but on the distribution of that mass. That’s true in the case of general motion
as well as planar rotation.

1 Or, if you prefer the more-general form of Newton’s Second Law ($\vec{F} = d\vec{p}/dt$), the analogous
rotational relationship is $\vec{\tau} = d\vec{L}/dt$.

If you think of the rotational analog to the translational equation $\vec{p} = m\vec{v}$,
you may be tempted to write an equation such as $\vec{L} = I\vec{\omega}$. But that equation
would indicate that the angular momentum $\vec{L}$ must be in the same direction as
the angular velocity $\vec{\omega}$, since multiplication by a scalar can change the length
but not the direction of a vector (unless the scalar is negative, in which case
the direction of the vector is reversed). For general motion, the situation is
more complex, as you can see by applying Eq. 6.2 to a single particle circling
about the axis shown in Figure 6.1. In this figure, the particle “m” is
circling around the z-axis, so the angular velocity ($\vec{\omega}$) points straight up, parallel
to the z-axis. In this view, you’re looking down the x-axis toward the origin
of the coordinate system, which is well below the plane of the particle’s path.
The particle is initially at the position shown on the left side of the figure,
and its velocity vector is coming out of the page. Since the vector angular
momentum is given by $\vec{L} = m\vec{r}\times\vec{v}$, you can find the direction of the angular
momentum at this initial instant by using your right hand to form the cross
product between $\vec{r}$ and $\vec{v}$, as described in Section 2.2. If you do this properly,
you should see that $\vec{L}$ initially points up and to the right, as shown by $\vec{L}_{\text{initial}}$
in the figure. At a later time, after the particle has completed one-half revolution
about the z-axis, its velocity vector is into the page, as shown in the right
portion of the figure. At that later instant, the cross product between $\vec{r}$ and $\vec{v}$
means that the direction of the angular momentum vector $\vec{L}$ is up and to the
left, as shown by $\vec{L}_{\text{later}}$.

Figure 6.1 Single point mass moving around an axis (figure labels: velocity
$\vec{v}_{\text{initial}}$ is out of the page; velocity $\vec{v}_{\text{later}}$ is into the page).

So not only is the angular-momentum vector $\vec{L}$ not parallel to the angular-velocity
vector $\vec{\omega}$, the direction of $\vec{L}$ is changing as the particle moves
around the axis, while the direction of $\vec{\omega}$ remains fixed along the z-axis.
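You can reproduce this behavior with a few lines of arithmetic based on Eqs. 6.1 and 6.2. In the sketch below, the mass, angular speed, orbit radius, and height of the orbit above the origin are made-up sample values, and the two positions are chosen to match the description of Figure 6.1 as viewed from the positive x-axis (first on the negative-y side of the z-axis, then on the positive-y side half a revolution later).

```python
def cross(a, b):
    """Vector cross product a x b for 3-component tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def angular_momentum(m, omega, r):
    """L = m r x v with v = omega x r (Eqs. 6.1 and 6.2)."""
    v = cross(omega, r)
    r_cross_v = cross(r, v)
    return tuple(m * c for c in r_cross_v)

m = 1.0                       # sample mass
omega = (0.0, 0.0, 2.0)       # angular velocity along +z
radius, height = 1.0, 3.0     # orbit radius and height above the origin

r_initial = (0.0, -radius, height)   # velocity is along +x (out of the page)
r_later = (0.0, radius, height)      # half a revolution later

L_initial = angular_momentum(m, omega, r_initial)
L_later = angular_momentum(m, omega, r_later)
print(L_initial)
print(L_later)
```

The z-component of the angular momentum is the same at both instants, but the y-component changes sign, so $\vec{L}$ sweeps around the z-axis while $\vec{\omega}$ stays fixed.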

Under these circumstances, you clearly cannot use a scalar value for the
moment of inertia to relate the angular momentum to the angular velocity
through an equation such as $\vec{L} = I\vec{\omega}$. A scalar moment of inertia simply isn’t
capable of relating a vector in one direction to a different vector in another
direction. But if you’ve followed the developments of Chapters 4 and 5, you’re
already familiar with a type of object that is capable of taking in a vector (such
as $\vec{\omega}$) and producing another vector (such as $\vec{L}$) that points in a different direction.
That object is a tensor. So although you may have initially learned about
the moment of inertia as a scalar value in the case of planar motion about
the origin, you should now understand why more-general problems require a
more-powerful approach, and that involves the representation of inertia as a
tensor rather than a scalar.

You may be thinking that simply by adding another particle of equal mass 
at the same distance on the other side of the z-axis, you could produce an 
additional bit of angular momentum that would add to the angular momentum 
of the original mass. In that case, the total angular momentum would indeed 
point straight up the z-axis, in exactly the same direction as the angular veloc- 
ity. So you may suspect that the relationship between the angular momentum 
and the angular velocity (and hence the nature of the inertia tensor) depends on 
the symmetry of the object. That suspicion is correct, as you’ll see when you 
examine the components of the inertia tensor. 

You can begin to understand the components of the inertia tensor by first 
writing the tensor equation relating angular momentum to angular velocity: 

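$$\vec{L} = \overleftrightarrow{I}\,\vec{\omega}, \tag{6.3}$$

and then using the definition of angular momentum:

$$\begin{aligned}
\vec{L} &= \vec{r}\times\vec{p}\\
&= \vec{r}\times(m\vec{v})\\
&= m\vec{r}\times\vec{v}\\
&= m\vec{r}\times(\vec{\omega}\times\vec{r}).
\end{aligned}$$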
The triple vector product in this expression can be simplified using the “BAC
minus CAB” rule described in Section 2.4, giving

$$\vec{L} = m[\vec{\omega}(\vec{r}\circ\vec{r}) - \vec{r}(\vec{r}\circ\vec{\omega})].$$

This is a usable expression for the angular momentum of a single particle, 
and you can modify it for use with multiple masses simply by summing (or for 
a continuous object by integrating) over all the masses. Thus the expression 
you’ll most often encounter will probably look something like this: 

$$\vec{L} = \sum_i m_i[\vec{\omega}(\vec{r}_i\circ\vec{r}_i) - \vec{r}_i(\vec{r}_i\circ\vec{\omega})], \tag{6.4}$$

where the index i denotes each element of mass of the object. 

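To see the moment of inertia in this expression, first expand the position
vector as $\vec{r}_i = x_i\hat{\imath} + y_i\hat{\jmath} + z_i\hat{k}$ and the angular velocity vector as
$\vec{\omega} = \omega_x\hat{\imath} + \omega_y\hat{\jmath} + \omega_z\hat{k}$ (note that the angular velocity $\vec{\omega}$ is the same for every mass element
in a rigid body, so it’s not necessary to write $\vec{\omega}_i$). Thus the expression for
angular momentum is

$$\vec{L} = \sum_i m_i[\vec{\omega}\,(x_i\hat{\imath} + y_i\hat{\jmath} + z_i\hat{k})\circ(x_i\hat{\imath} + y_i\hat{\jmath} + z_i\hat{k}) - \vec{r}_i\,(x_i\hat{\imath} + y_i\hat{\jmath} + z_i\hat{k})\circ(\omega_x\hat{\imath} + \omega_y\hat{\jmath} + \omega_z\hat{k})],$$

and performing the dot products gives

$$\vec{L} = \sum_i m_i[\vec{\omega}(x_i^2 + y_i^2 + z_i^2) - \vec{r}_i(x_i\omega_x + y_i\omega_y + z_i\omega_z)].$$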

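Since the x-component of $\vec{\omega}$ is $\omega_x$ and the x-component of $\vec{r}_i$ is $x_i$, the
x-component of the angular momentum can be written

$$\begin{aligned}
L_x &= \sum_i m_i[\omega_x(x_i^2 + y_i^2 + z_i^2) - x_i(x_i\omega_x + y_i\omega_y + z_i\omega_z)]\\
&= \sum_i m_i[\omega_x x_i^2 + \omega_x y_i^2 + \omega_x z_i^2 - x_i^2\omega_x - x_i y_i\omega_y - x_i z_i\omega_z]\\
&= \sum_i m_i[\omega_x(y_i^2 + z_i^2) - x_i y_i\omega_y - x_i z_i\omega_z].
\end{aligned}$$

The y- and z-components come out as

$$\begin{aligned}
L_y &= \sum_i m_i[\omega_y(x_i^2 + z_i^2) - y_i x_i\omega_x - y_i z_i\omega_z],\\
L_z &= \sum_i m_i[\omega_z(x_i^2 + y_i^2) - z_i x_i\omega_x - z_i y_i\omega_y].
\end{aligned}$$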

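These three equations for the components of angular momentum ($\vec{L}$) may be
written as a single matrix equation:

$$\begin{pmatrix} L_x\\ L_y\\ L_z \end{pmatrix} = \begin{pmatrix}
\sum_i m_i(y_i^2 + z_i^2) & -\sum_i m_i x_i y_i & -\sum_i m_i x_i z_i\\
-\sum_i m_i y_i x_i & \sum_i m_i(x_i^2 + z_i^2) & -\sum_i m_i y_i z_i\\
-\sum_i m_i z_i x_i & -\sum_i m_i z_i y_i & \sum_i m_i(x_i^2 + y_i^2)
\end{pmatrix}\begin{pmatrix} \omega_x\\ \omega_y\\ \omega_z \end{pmatrix}. \tag{6.5}$$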

The elements of the center matrix represent the components of the inertia tensor
($\overleftrightarrow{I}$). Note that the dimensions of each element are mass times distance
squared (SI units of kg m$^2$), just as in the case of scalar moment of inertia.

In some texts, you’ll find the elements of the inertia tensor written as
something like

$$I_{ab} = \sum_i m_i\left(\delta_{ab}\,r_i^2 - r_{ia}\,r_{ib}\right),$$

which are the same elements as shown in Eq. 6.5.
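The compact index form translates almost directly into code. The following sketch builds the inertia tensor for a handful of point masses and confirms that multiplying it by the angular velocity gives the same angular momentum as summing $m_i\,\vec{r}_i\times(\vec{\omega}\times\vec{r}_i)$ over the masses. The function names and the sample masses, positions, and angular velocity are illustrative assumptions.

```python
def inertia_tensor(masses, positions):
    """I_ab = sum_i m_i (delta_ab r_i^2 - r_ia r_ib) as a 3x3 list."""
    tensor = [[0.0] * 3 for _ in range(3)]
    for m, r in zip(masses, positions):
        r_squared = sum(c * c for c in r)
        for a in range(3):
            for b in range(3):
                delta = 1.0 if a == b else 0.0
                tensor[a][b] += m * (delta * r_squared - r[a] * r[b])
    return tensor

def apply_tensor(tensor, omega):
    """Matrix-vector product giving L_a = I_ab omega_b."""
    return [sum(tensor[a][b] * omega[b] for b in range(3)) for a in range(3)]

def cross(a, b):
    """Vector cross product a x b."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def direct_angular_momentum(masses, positions, omega):
    """Sum of m r x (omega x r) over all point masses."""
    total = [0.0, 0.0, 0.0]
    for m, r in zip(masses, positions):
        term = cross(r, cross(omega, r))
        for a in range(3):
            total[a] += m * term[a]
    return total

masses = [1.0, 2.0]
positions = [(1.0, 2.0, 3.0), (-1.0, 0.0, 2.0)]
omega = (0.5, -1.0, 2.0)

I = inertia_tensor(masses, positions)
L_from_tensor = apply_tensor(I, omega)
L_direct = direct_angular_momentum(masses, positions, omega)
print(L_from_tensor)
print(L_direct)
```

The two results agree to within rounding error, and the tensor produced by `inertia_tensor` is symmetric, as you can see from the matrix in Eq. 6.5.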

The diagonal elements of the inertia tensor are called “moments of inertia” 
and the off-diagonal elements are called “products of inertia.” To understand 
the physical meaning of each of these elements, recall that the moment of 
inertia characterizes an object’s tendency to resist angular acceleration. That 
resistance depends not only on the object’s mass, but on the distribution of that 
mass relative to the axis of rotation. 

Each term $I_{ab}$ tells you how much angular momentum in the $a$-direction
is produced by rotation about the $b$-axis. So $I_{11} = I_{xx}$ tells you how much
angular momentum the object produces in the x-direction due to rotation about
the x-axis. And $I_{23} = I_{yz}$ tells you how much angular momentum the object
produces in the y-direction due to rotation about the z-axis.

How those off-diagonal terms come about is explained below, but you
should first take a look at the diagonal terms. In the expression for $I_{xx}$, for
each element of mass ($m_i$), the element’s mass is multiplied by the square of
the distance from the x-axis ($y_i^2 + z_i^2$). So this is just the three-dimensional version
of the equation you may have learned for planar rotation that says that the
moment of inertia of a particle is $I = mr^2$, where $r$ is the particle’s distance
from the axis of rotation. Looking down the diagonal of the inertia tensor, you
see the contribution to the x-component of angular momentum due to rotation
about the x-axis, the contribution to the y-component of angular momentum
due to rotation about the y-axis, and the contribution to the z-component of
angular momentum due to rotation about the z-axis. The bottom line is that
distributions of mass that are symmetric about each axis contribute to the diagonal
terms of the moment of inertia matrix.

The off-diagonal elements of the inertia tensor are somewhat different. In
$I_{yz}$, for each element of mass ($m_i$), the element’s mass is multiplied by the
product of the element’s y- and z-coordinates ($y_i z_i$). As explained above, this
determines the contribution to the y-component of angular momentum due to
rotation about the z-axis. And when does rotation about the z-axis produce a
y-component of angular momentum? When there’s an asymmetric distribution
of mass about the z-axis, for example as shown with the single particle in Figure
6.1. Likewise, the $I_{xy}$ term determines the contribution to the x-component
of angular momentum due to rotation about the y-axis. Such contributions
come from mass distributions that are asymmetric about the y-axis. Hence
distributions of mass that are asymmetric about a given axis contribute to the
off-diagonal terms of the moment of inertia matrix.

To see how this works, consider the five point masses on the corners and top 
of a pyramid as shown in Figure 6.2. To determine the inertia tensor for this 
configuration of masses, you simply have to plug the mass and coordinates of 
each of the masses into Equation 6.5. If the mass of each of the five masses is
the same and equal to “m” and the height of the pyramid is equal to the length
of each of the bottom sides (with a value of 2a as shown in Figure 6.2), the $I_{xx}$
term is simply

\[ \begin{aligned}
I_{xx} &= m_1(y_1^2 + z_1^2) + m_2(y_2^2 + z_2^2) + m_3(y_3^2 + z_3^2) + m_4(y_4^2 + z_4^2) + m_5(y_5^2 + z_5^2) \\
       &= m_1(a^2 + 0^2) + m_2(a^2 + 0^2) + m_3[(-a)^2 + 0^2] + m_4[(-a)^2 + 0^2] + m_5[0^2 + (2a)^2] \\
       &= 8ma^2,
\end{aligned} \]

and you should obtain the same result for the other diagonal elements $I_{yy}$ and
$I_{zz}$.

Figure 6.2 Five point masses arrayed as a pyramid.

Moving on to the off-diagonal elements, the $I_{xy}$ term is

\[ \begin{aligned}
I_{xy} &= -m_1 x_1 y_1 - m_2 x_2 y_2 - m_3 x_3 y_3 - m_4 x_4 y_4 - m_5 x_5 y_5 \\
       &= -m_1(a)(a) - m_2(-a)(a) - m_3(-a)(-a) - m_4(a)(-a) - m_5(0)(0) \\
       &= -m(2a^2 - 2a^2) = 0,
\end{aligned} \]

which is the same as all other off-diagonal elements. Thus the matrix
representing the inertia tensor for the configuration shown in Figure 6.2 is

\[ \begin{pmatrix} 8ma^2 & 0 & 0 \\ 0 & 8ma^2 & 0 \\ 0 & 0 & 8ma^2 \end{pmatrix}. \]
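The calculation above can be checked numerically. The following is a minimal sketch, assuming units in which m = 1 and a = 1 and taking the base-corner coordinates (±a, ±a, 0) and apex (0, 0, 2a) from the worked $I_{xx}$ and $I_{xy}$ sums; the function name `inertia_tensor` is illustrative and does not come from the text.

```python
def inertia_tensor(masses):
    """Return the 3x3 inertia tensor for a list of (m, (x, y, z)) point masses."""
    tensor = [[0.0] * 3 for _ in range(3)]
    for m, r in masses:
        r_squared = sum(c * c for c in r)
        for a in range(3):
            for b in range(3):
                delta = 1.0 if a == b else 0.0
                # I_ab = sum over masses of m * (delta_ab * r^2 - r_a * r_b)
                tensor[a][b] += m * (delta * r_squared - r[a] * r[b])
    return tensor


# Pyramid of Figure 6.2 in units where m = 1 and a = 1 (an assumption
# made only to keep the numbers simple).
m, a = 1.0, 1.0
pyramid = [
    (m, (a, a, 0.0)),        # m1
    (m, (-a, a, 0.0)),       # m2
    (m, (-a, -a, 0.0)),      # m3
    (m, (a, -a, 0.0)),       # m4
    (m, (0.0, 0.0, 2 * a)),  # m5 at the top of the pyramid
]

I_pyramid = inertia_tensor(pyramid)
for row in I_pyramid:
    print(row)
```

Each diagonal entry should come out as 8 (in units of ma²) and each off-diagonal entry as 0, matching the matrix above.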

There’s a great deal of information in the components of this inertia tensor. 
The fact that the off-diagonal elements are all zero means that the selected x-, 
y-, and z-axes are “principal axes” for this object and choice of origin, and
the moments of inertia are “principal moments” of the object. When an object 
rotates about one of the principal axes, the angular momentum vector and the 
angular velocity vector are parallel. This is an indication of the object’s sym- 
metry. In this case, the fact that all three principal moments are equal means 
that this object qualifies as a “spherical top” (in Mechanics, “top” refers to any 
rigid rotating object). And for a spherical top, any three mutually orthogonal 
axes are principal axes. 

If the height of mass $m_5$ above the plane of the other four masses is increased
to twice its original height (so that its z-coordinate becomes 4a instead of 2a),
the greater distance from the x- and y-axes increases the moment of inertia
about those axes, so that the inertia tensor becomes

\[ \begin{pmatrix} 20ma^2 & 0 & 0 \\ 0 & 20ma^2 & 0 \\ 0 & 0 & 8ma^2 \end{pmatrix}. \]

Of course, the distance of $m_5$ from the z-axis remains zero irrespective of its
height, so this mass is not contributing to the component $I_{zz}$ in either case, and
that component remains the same. Now that only two of the principal moments
of inertia are equal, the object is no longer a spherical top, and has become a
“symmetric top” (and if all three principal moments were different, the object
would be called an “asymmetric top”). One final bit of terminology: if one of the
principal moments of an object is zero and the other two are equal to one
another, the object is called a “rotor.”
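The terminology above amounts to a small decision rule on the three principal moments. Here is a hedged sketch of that rule; the function name `classify_top` and the tolerance `tol` are assumptions introduced for illustration, and the check for “rotor” is made before the check for “symmetric top” because a rotor also has two equal moments.

```python
def classify_top(i1, i2, i3, tol=1e-9):
    """Classify a rigid body from its three principal moments of inertia."""
    lo, mid, hi = sorted([i1, i2, i3])
    if abs(hi - lo) <= tol:
        return "spherical top"      # all three moments equal
    if abs(lo) <= tol and abs(hi - mid) <= tol:
        return "rotor"              # one moment zero, the other two equal
    if abs(mid - lo) <= tol or abs(hi - mid) <= tol:
        return "symmetric top"      # exactly two moments equal
    return "asymmetric top"         # all three moments different


# Principal moments in units of m*a^2 for the two pyramids discussed above.
print(classify_top(8, 8, 8))     # original pyramid
print(classify_top(20, 20, 8))   # apex mass raised to height 4a
```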

Another way to change the inertia tensor of this object is to fiddle with the
masses of the particles. If, for example, you double the mass of $m_5$ from its
original value of m to 2m, while leaving the other four masses the same, the
inertia tensor becomes

\[ \begin{pmatrix} 12ma^2 & 0 & 0 \\ 0 & 12ma^2 & 0 \\ 0 & 0 & 8ma^2 \end{pmatrix}. \]

As expected, there’s no change in the $I_{zz}$ component since $m_5$ doesn’t
contribute to that moment.

Now consider what will happen to the inertia tensor if you rotate the coor- 
dinate axes. Remember, the inertia tensor is determined for a given location of 
the origin and a given orientation of the coordinate axes, so it seems reasonable 
to expect a change in the components if the coordinate axes are rotated. 

To test this, imagine rotating the coordinate axes counter-clockwise about 
the x-axis, as shown in Figure 6.3. In this figure, you’re looking down the 
x-axis toward the origin, so the y- and z-axes appear tilted (they’re labeled
y' and z' to distinguish them from the original y- and z-axes). In this case,
the rotation angle is approximately 30°. Figure 6.3(a) shows that the axes 
have rotated while the masses remained in their original positions, while Fig- 
ure 6.3(b) shows the view you would get if you tilted your head to make the 
z'-axis vertical and y'-axis horizontal. 

What effect might this have on the inertia tensor? To determine that, you’ll
need to know the coordinates of each of the masses in the new (rotated)
coordinate system (that is, you need to know x', y', and z' for each mass).
Fortunately, Chapter 4 should have given you some idea of how to do that by using
a rotation matrix to convert between the original and rotated coordinates. In this
case, that rotation matrix is given by



\[ \begin{pmatrix} x' \\ y' \\ z' \end{pmatrix} =
\begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos\theta & \sin\theta \\ 0 & -\sin\theta & \cos\theta \end{pmatrix}
\begin{pmatrix} x \\ y \\ z \end{pmatrix}. \tag{6.6} \]

Figure 6.3 Coordinate axes rotated 30° anti-clockwise around x-axis.

If you go back to the original masses (all five masses equal to mass m)
and original height of $m_5$ (which is 2a above the xy plane) and then apply this
rotation, you should find the following values for the components of the matrix
representing the inertia tensor:

\[ I = \begin{pmatrix} 8ma^2 & 0 & 0 \\ 0 & 8ma^2 & 0 \\ 0 & 0 & 8ma^2 \end{pmatrix}. \]

If you’re surprised to find that there’s no change from the original inertia
tensor (the one without the rotation), remember that the symmetry of this object
makes it a spherical top, which means that any set of three orthogonal axes will 
be principal axes. So tilting the axes should not have caused any change in the 
inertia tensor. 

That sounds reasonable enough, but if you compare the location of the
masses in Figure 6.3 to the single-mass case shown in Figure 6.1, doesn’t it
also seem reasonable to expect that $m_5$ will produce a component of angular
momentum in the −y-direction (as the single mass did in Figure 6.1)?

Yes, it does. And, in fact, mass $m_5$ does indeed produce a component of
angular momentum in the −y-direction. To demonstrate that, just set the other
four masses to zero and calculate the inertia tensor for $m_5$ alone (don’t forget
that the coordinate axes are rotated). You should get

\[ I = \begin{pmatrix} 4ma^2 & 0 & 0 \\ 0 & 3ma^2 & -1.73ma^2 \\ 0 & -1.73ma^2 & ma^2 \end{pmatrix}. \]

So there it is: $I_{yz}$ (which represents the y-component of angular momentum
produced by rotation around the z-axis) is clearly not zero. But why did you
get zero for all the off-diagonal elements when you first calculated the inertia
tensor for the pyramid with tilted coordinate axes? The answer is that the other
four masses also have something to say about the inertia tensor. To isolate their
contribution to $I_{yz}$, try setting the mass of $m_5$ to zero and leaving the other four
masses equal to m. The inertia tensor should then be

\[ I = \begin{pmatrix} 4ma^2 & 0 & 0 \\ 0 & 5ma^2 & 1.73ma^2 \\ 0 & 1.73ma^2 & 7ma^2 \end{pmatrix}. \]

Figure 6.4 Angular momentum vectors for masses in plane of page.


And there’s the answer: the other four masses contribute exactly as much
angular momentum in the positive y-direction as $m_5$ contributes to the negative
y-direction, as illustrated in Figure 6.4. And remember from Chapter 5 that
you can add tensors by adding their components. So when you add the inertia
tensor for $m_5$ to the inertia tensor for the other four masses, you get the (nicely
diagonal) inertia tensor for the five-mass pyramid.
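The cancellation described above can be reproduced in a few lines. This is a sketch under stated assumptions: units with m = 1 and a = 1, the corner coordinates (±a, ±a, 0) and apex (0, 0, 2a) used earlier in the chapter, and the x-axis rotation of Eq. 6.6 with θ = 30°. The helper names are illustrative only.

```python
import math


def inertia_tensor(masses):
    """3x3 inertia tensor for a list of (m, (x, y, z)) point masses."""
    tensor = [[0.0] * 3 for _ in range(3)]
    for m, r in masses:
        r_squared = sum(c * c for c in r)
        for a in range(3):
            for b in range(3):
                delta = 1.0 if a == b else 0.0
                tensor[a][b] += m * (delta * r_squared - r[a] * r[b])
    return tensor


def rotate_axes_about_x(r, theta):
    """Coordinates of point r in axes rotated by theta about x (Eq. 6.6)."""
    x, y, z = r
    c, s = math.cos(theta), math.sin(theta)
    return (x, c * y + s * z, -s * y + c * z)


def in_rotated_axes(masses, theta):
    return [(m, rotate_axes_about_x(r, theta)) for m, r in masses]


def add_tensors(A, B):
    """Tensors add component by component."""
    return [[A[i][j] + B[i][j] for j in range(3)] for i in range(3)]


m, a = 1.0, 1.0                     # illustrative units
theta = math.radians(30.0)
base = [(m, (a, a, 0.0)), (m, (-a, a, 0.0)),
        (m, (-a, -a, 0.0)), (m, (a, -a, 0.0))]
apex = [(m, (0.0, 0.0, 2 * a))]     # m5 alone

I_apex = inertia_tensor(in_rotated_axes(apex, theta))
I_base = inertia_tensor(in_rotated_axes(base, theta))
I_total = add_tensors(I_apex, I_base)

print("I_yz from m5 alone:    ", round(I_apex[1][2], 3))
print("I_yz from other four:  ", round(I_base[1][2], 3))
print("I_yz for whole pyramid:", round(I_total[1][2], 3))
```

The two $I_{yz}$ contributions should be equal in magnitude and opposite in sign, so that their sum vanishes to within floating-point rounding.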

To demonstrate the balance between $m_5$ and the other four masses, you may
find it interesting to again move $m_5$ up the z-axis to twice its original height
and then perform the 30 degree rotation of the coordinate axes. In this case,
you should find the inertia tensor to be

\[ I = \begin{pmatrix} 20ma^2 & 0 & 0 \\ 0 & 17ma^2 & -5.2ma^2 \\ 0 & -5.2ma^2 & 11ma^2 \end{pmatrix}, \]

and clearly the $I_{yz}$ terms from $m_5$ and the other four masses no longer
cancel.

You can determine the inertia tensor for any orientation of the coordinate
axes by applying rotations about multiple axes. If you wish, for example, to
rotate first about the x-axis by angle $\theta_1$ and then about the y-axis by angle $\theta_2$,
you can combine the rotation matrices as

\[ \begin{pmatrix} x' \\ y' \\ z' \end{pmatrix} =
\begin{pmatrix} \cos\theta_2 & 0 & \sin\theta_2 \\ 0 & 1 & 0 \\ -\sin\theta_2 & 0 & \cos\theta_2 \end{pmatrix}
\begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos\theta_1 & \sin\theta_1 \\ 0 & -\sin\theta_1 & \cos\theta_1 \end{pmatrix}
\begin{pmatrix} x \\ y \\ z \end{pmatrix}, \tag{6.7} \]

which in the case of two 30 degree rotations (first about the x-axis and then
about the y-axis) gives a combined rotation matrix of

\[ \begin{pmatrix} 0.866 & -0.25 & 0.433 \\ 0 & 0.866 & 0.5 \\ -0.5 & -0.433 & 0.75 \end{pmatrix}
\begin{pmatrix} x \\ y \\ z \end{pmatrix}. \tag{6.8} \]

If you leave $m_5$ at height 4a and then apply this rotation to the coordinates,
the inertia tensor becomes

\[ \begin{pmatrix} 17.8ma^2 & -2.6ma^2 & -3.9ma^2 \\ -2.6ma^2 & 17ma^2 & -4.5ma^2 \\ -3.9ma^2 & -4.5ma^2 & 13.3ma^2 \end{pmatrix}. \tag{6.9} \]



You can perform a quick check on your calculation by verifying that the 
coordinate-axis rotation has changed neither the trace nor the determinant of 
the matrix. 2 
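The reason this check works can be written in one line for each quantity. The sketch below assumes that the rotated tensor can be expressed as the matrix product $R I R^{-1}$, where R is the rotation matrix and I is the matrix of the unrotated inertia tensor (diagonal entries 20ma², 20ma², and 8ma² with $m_5$ at height 4a); the symbols R and I are notation introduced here for the purpose of the check.

```latex
\begin{aligned}
\operatorname{tr}(R I R^{-1}) &= \operatorname{tr}(I R^{-1} R) = \operatorname{tr}(I)
  = 20ma^2 + 20ma^2 + 8ma^2 = 48ma^2, \\
\det(R I R^{-1}) &= \det(R)\,\det(I)\,\det(R^{-1}) = \det(I)
  = (20ma^2)(20ma^2)(8ma^2) = 3200\,m^3a^6 .
\end{aligned}
```

Using the rounded diagonal entries of Eq. 6.9, the trace is 17.8 + 17 + 13.3 = 48.1 in units of ma², which agrees with 48 to within rounding.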

Instead of finding the new coordinates of each mass in the rotated system,
an alternative approach allows you to find the inertia tensor for rotated
coordinates directly. That approach is to apply a “similarity transform” to the original
inertia tensor. Here’s how that works: the angular momentum is related to
the inertia tensor and angular velocity in the original (unrotated) coordinate
system as

\[ \vec{L} = I\,\vec{\omega}, \]

and you rotate the coordinates by applying a rotation matrix R (which may be
the product of several rotation matrices). You can therefore write

\[ \vec{L}' = R\vec{L} = R(I\,\vec{\omega}). \]

And since the product of any matrix and its inverse is just the identity matrix,
you can insert the term $R^{-1}R$ in front of $\vec{\omega}$:

\[ \begin{aligned}
\vec{L}' = R\vec{L} &= R I (R^{-1}R)\,\vec{\omega} \\
                    &= (R I R^{-1})\,R\vec{\omega}.
\end{aligned} \]

But $R\vec{\omega}$ is just $\vec{\omega}'$, so

\[ \vec{L}' = (R I R^{-1})\,\vec{\omega}'. \]

2 The matrix review on the book’s website explains how to do these calculations.

Thus the expression $(R I R^{-1})$ relates angular momentum to angular velocity
in the rotated coordinate system, which means that this expression is the
inertia tensor in that system. So instead of calculating the new coordinates for
each mass and plugging them into the equation for the inertia tensor, you can
instead simply apply the rotation matrix and its inverse to the matrix representing
the inertia tensor directly (but remember that the sequence matters when
you’re doing matrix multiplication).
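The two routes to the rotated inertia tensor can be compared numerically. The following sketch assumes units with m = 1 and a = 1, the pyramid with $m_5$ at height 4a, and the two 30° rotations (first about x, then about y) with the matrix forms of Eq. 6.7. It also uses the fact that the inverse of a rotation matrix equals its transpose. All function names are illustrative.

```python
import math


def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def transpose(A):
    return [list(row) for row in zip(*A)]


def matvec(A, v):
    return tuple(sum(A[i][k] * v[k] for k in range(3)) for i in range(3))


def inertia_tensor(masses):
    tensor = [[0.0] * 3 for _ in range(3)]
    for m, r in masses:
        r_squared = sum(c * c for c in r)
        for a in range(3):
            for b in range(3):
                delta = 1.0 if a == b else 0.0
                tensor[a][b] += m * (delta * r_squared - r[a] * r[b])
    return tensor


def rot_x(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[1.0, 0.0, 0.0], [0.0, c, s], [0.0, -s, c]]


def rot_y(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]   # form used in Eq. 6.7


m, a = 1.0, 1.0
masses = [(m, (a, a, 0.0)), (m, (-a, a, 0.0)), (m, (-a, -a, 0.0)),
          (m, (a, -a, 0.0)), (m, (0.0, 0.0, 4 * a))]
theta = math.radians(30.0)
R = matmul(rot_y(theta), rot_x(theta))   # x-rotation is applied first

# Route 1: rotate every coordinate, then build the tensor.
I_from_coords = inertia_tensor([(mi, matvec(R, r)) for mi, r in masses])

# Route 2: similarity transform R I R^-1, with R^-1 = R^T for a rotation.
I_original = inertia_tensor(masses)
I_similarity = matmul(matmul(R, I_original), transpose(R))

for row_a, row_b in zip(I_from_coords, I_similarity):
    print([round(v, 2) for v in row_a], [round(v, 2) for v in row_b])
```

The two printed matrices should agree entry by entry, which is the point of the similarity-transform shortcut.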

Using this approach, the process looks like this:

\[ \begin{pmatrix} 0.866 & -0.25 & 0.433 \\ 0 & 0.866 & 0.5 \\ -0.5 & -0.433 & 0.75 \end{pmatrix}
\begin{pmatrix} 20ma^2 & 0 & 0 \\ 0 & 20ma^2 & 0 \\ 0 & 0 & 8ma^2 \end{pmatrix}
\begin{pmatrix} 0.866 & -0.25 & 0.433 \\ 0 & 0.866 & 0.5 \\ -0.5 & -0.433 & 0.75 \end{pmatrix}^{-1}
= \begin{pmatrix} 17.8ma^2 & -2.6ma^2 & -3.9ma^2 \\ -2.6ma^2 & 17ma^2 & -4.5ma^2 \\ -3.9ma^2 & -4.5ma^2 & 13.3ma^2 \end{pmatrix}, \]

which is identical to the result obtained by inserting the rotated coordinates
into the inertia tensor.

If you’ve studied matrix algebra, you may be wondering about the possibility
of finding the principal axes and principal moments by manipulating the
matrix representing the inertia tensor into a diagonal form. That is certainly
possible, and you can read about doing that using eigenvectors and eigenvalues
on this book’s website.

And if you’re able by visual inspection to determine the angles of rotation
needed to align the axes with the symmetries of the object, you can use the
similarity transform approach to diagonalize the inertia matrix. You can see
how that works by looking at the problems at the end of this chapter and the
on-line solutions.


6.2 The electromagnetic field tensor 

One of the defining characteristics of our modern world is the availability
of broadband communication channels which allow near-instantaneous
transfer of information over great distances without the need for physical
connection. The technology used in this communication descends directly from
the equations synthesized by Scotsman James Clerk Maxwell in the 1860s,
now called “Maxwell’s Equations.” In view of the impact of electromagnetic
telecommunications on our lives, it’s not surprising that in 2004 the readers of
Physics World voted Maxwell’s Equations to be the “greatest equations” ever
developed.

The four vector equations that have come to be called Maxwell’s Equations 
are Gauss’s Law for electric fields, Gauss’s Law for magnetic fields, Faraday’s 
Law, and the Ampere-Maxwell Law, each of which may be written in inte- 
gral or differential form. The integral forms describe the behavior of electric 
and magnetic fields over surfaces or around paths, while the differential forms 
apply to specific locations. The differential forms are most relevant to the vec- 
tor and tensor operations discussed in this book, involving the scalar product, 
divergence, curl, and partial derivatives discussed in Chapter 2. They’re also 
closely related to the subject of this section, the electromagnetic field-strength 
tensor. 

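The differential forms of Maxwell’s Equations are usually written as

Gauss’s Law for electric fields: $\vec{\nabla} \circ \vec{E} = \dfrac{\rho}{\epsilon_0}$,

Gauss’s Law for magnetic fields: $\vec{\nabla} \circ \vec{B} = 0$,

Faraday’s Law: $\vec{\nabla} \times \vec{E} = -\dfrac{\partial \vec{B}}{\partial t}$,

Ampere-Maxwell Law: $\vec{\nabla} \times \vec{B} = \mu_0 \vec{J} + \mu_0 \epsilon_0 \dfrac{\partial \vec{E}}{\partial t}$.

In order to understand the electromagnetic tensor, you may find it helpful to
briefly review the meaning of each of these equations. 3

\[ \vec{\nabla} \circ \vec{E} = \frac{\rho}{\epsilon_0} \]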

Gauss’s Law for electric fields states that the divergence ($\vec{\nabla} \circ$) of the electric
field ($\vec{E}$) at any location is proportional to the electric charge density (ρ) at
that location. That’s because electrostatic field lines begin on positive charge
and end on negative charge (hence the field lines tend to diverge away from
locations of positive charge and converge toward locations of negative charge).

\[ \vec{\nabla} \circ \vec{B} = 0 \]

Gauss’s Law for magnetic fields tells you that the divergence ($\vec{\nabla} \circ$) of the
magnetic field ($\vec{B}$) at any location must be zero. This is true because there
is apparently no isolated “magnetic charge” in the universe, so magnetic field
lines neither diverge nor converge.

3 Complete descriptions may be found in any introductory electromagnetics text.

\[ \vec{\nabla} \times \vec{E} = -\frac{\partial \vec{B}}{\partial t} \]

Faraday’s Law indicates that the curl ($\vec{\nabla} \times$) of the electric field ($\vec{E}$) at any
location is equal to the negative of the time rate of change of the magnetic field at
that location. That’s because a changing magnetic field produces a circulating
electric field.

\[ \vec{\nabla} \times \vec{B} = \mu_0 \vec{J} + \mu_0 \epsilon_0 \frac{\partial \vec{E}}{\partial t} \]

Ampere’s Law, as modified by Maxwell, tells you that the curl ($\vec{\nabla} \times$) of the
magnetic field ($\vec{B}$) at any location is proportional to the electric current density
($\vec{J}$) plus the time rate of change of the electric field at that location. This is
the case because a circulating magnetic field is produced both by an electric
current and by a changing electric field.


Note that Maxwell’s Equations relate the spatial behavior of fields to the
sources of those fields. Those sources are electric charge (with density ρ)
appearing in Gauss’s Law for electric fields, electric current (with density $\vec{J}$)
appearing in the Ampere-Maxwell Law, changing magnetic field (with time
derivative $\partial \vec{B}/\partial t$) appearing in Faraday’s Law, and changing electric field (with
time derivative $\partial \vec{E}/\partial t$) appearing in the Ampere-Maxwell Law.

One additional equation is needed to fully characterize electromagnetic
interactions. That equation is called the “continuity equation,” usually written
like this:

\[ \frac{\partial \rho}{\partial t} = -\vec{\nabla} \circ \vec{J}, \]

where ρ is the density of electric charge and $\vec{J}$ is the current density.

The continuity equation tells you that the time rate of change of the density
of electric charge ($\partial \rho/\partial t$) equals the negative of the divergence of the electric
current density ($\vec{\nabla} \circ \vec{J}$). That’s because negative divergence means convergence,
and if the convergence of the current density $\vec{J}$ is positive at a point, then more
positive charge must be arriving at that location than is being carried away. If
that’s happening, then the density of positive charge at that point must increase
(meaning that $\partial \rho/\partial t$ will be positive in this case).

As valuable as Maxwell’s Equations are individually, the real power of these
equations is realized by combining them together to produce the wave
equation. Taking the curl of both sides of Faraday’s Law and inserting the curl of $\vec{B}$
from the Ampere-Maxwell Law results in the equation

\[ \nabla^2 \vec{E} = \mu_0 \epsilon_0 \frac{\partial^2 \vec{E}}{\partial t^2}, \tag{6.10} \]

where $\nabla^2() = \vec{\nabla} \circ \vec{\nabla}()$ is the vector form of the Laplacian operator. 4 This
equation applies to regions in which the charge density (ρ) and the current
density ($\vec{J}$) are both zero.
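For readers who want the intermediate steps, here is a compact sketch of how Eq. 6.10 follows in a region with ρ = 0 and $\vec{J} = 0$. It relies on the standard vector identity for the curl of a curl, which is not stated in this section and is brought in here as an assumption of the sketch.

```latex
\begin{aligned}
\vec{\nabla} \times (\vec{\nabla} \times \vec{E})
  &= -\frac{\partial}{\partial t}\,(\vec{\nabla} \times \vec{B})
  && \text{(curl of Faraday's Law)} \\
  &= -\mu_0 \epsilon_0 \frac{\partial^2 \vec{E}}{\partial t^2}
  && \text{(Ampere-Maxwell Law with } \vec{J} = 0\text{)} \\
\vec{\nabla} \times (\vec{\nabla} \times \vec{E})
  &= \vec{\nabla}(\vec{\nabla} \circ \vec{E}) - \nabla^2 \vec{E} = -\nabla^2 \vec{E}
  && \text{(vector identity; Gauss's Law with } \rho = 0\text{)} \\
\Rightarrow \quad \nabla^2 \vec{E}
  &= \mu_0 \epsilon_0 \frac{\partial^2 \vec{E}}{\partial t^2}
\end{aligned}
```

The same steps, starting from the curl of the Ampere-Maxwell Law, lead to the corresponding equation for the magnetic field.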

You can find a similar equation for the magnetic field by taking the curl of
both sides of the Ampere-Maxwell Law and then inserting the curl of $\vec{E}$ from
Faraday’s Law. This gives

\[ \nabla^2 \vec{B} = \mu_0 \epsilon_0 \frac{\partial^2 \vec{B}}{\partial t^2}. \tag{6.11} \]

It’s instructive to compare Eqs. 6.10 and 6.11 to the general equation for a
propagating wave:

\[ \nabla^2 \vec{A} = \frac{1}{v^2} \frac{\partial^2 \vec{A}}{\partial t^2}, \tag{6.12} \]

where v is the speed of propagation of the wave. Note the $1/v^2$ term, which
leads to the conclusion that the velocity of an electromagnetic wave depends
only on the electric permittivity ($\epsilon_0$) and magnetic permeability ($\mu_0$) of free
space (specifically, $\mu_0 \epsilon_0 = 1/v^2$, or $v = 1/\sqrt{\mu_0 \epsilon_0} = 3 \times 10^8$ m/s). Most
importantly, that velocity is completely independent of the motion of the
observer. It was this feature of electromagnetic waves that put Albert Einstein
onto the path that eventually led to the Theory of Special Relativity.
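The numerical value quoted above is easy to reproduce. This short sketch assumes the classical value $\mu_0 = 4\pi \times 10^{-7}$ H/m and the CODATA 2018 value of $\epsilon_0$; neither number appears in the text, so treat them as inputs supplied for illustration.

```python
import math

# Assumed constants (not given in the text):
MU_0 = 4 * math.pi * 1e-7        # magnetic permeability of free space, H/m
EPSILON_0 = 8.8541878128e-12     # electric permittivity of free space, F/m

# Comparing Eqs. 6.10 and 6.12 gives mu_0 * epsilon_0 = 1 / v^2.
v = 1.0 / math.sqrt(MU_0 * EPSILON_0)
print(f"v = {v:.4e} m/s")
```

The result is close to 3 × 10⁸ m/s, the speed of light in a vacuum.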

To arrive at the Theory of Special Relativity, Einstein held fast to two 
postulates. Those postulates are: 


1) The laws of physics must be the same in all inertial (that is, non- 
accelerating) frames of reference. 

2) The speed of light in a vacuum is constant and does not depend on the 
motion of the source or observer. 


Steadfast faithfulness to these postulates even in the face of counter-intuitive 
conclusions allowed Einstein to see that distances in space and intervals of time 
are not absolute but depend on the relative motion of the observer. Additionally, 
space and time are not separate but are linked together into four-dimensional 
spacetime, and it is the four-dimensional spacetime interval that is invariant 
across all inertial reference frames. 

To understand Einstein’s approach, consider the two Cartesian reference
frames shown in Figure 6.5. As indicated by the arrow in the figure, the primed
reference frame is moving with velocity v in the positive x-direction. Using the
traditional Galilean approach, the coordinate (x, y, and z) and time (t) values
for a point measured in both the unprimed and primed coordinate systems are
related by these equations:

\[ \begin{aligned}
t' &= t, \\
x' &= x - vt, \\
y' &= y, \\
z' &= z,
\end{aligned} \]

since the primed frame is moving only in the x-direction. 5

Figure 6.5 Primed reference frame moving along x-axis with velocity v.

4 If you’d like to see the details of the derivation of the electromagnetic-wave equation, you’ll
find them in the on-line solutions to the problems at the end of this chapter.

Einstein realized that the second postulate of Special Relativity (the con- 
stancy of the speed of light) is inconsistent with the Galilean transform shown 
above, and that consistent results are obtained only when a different transform 
is used between the unprimed and primed coordinate systems. That transform 
must hold the space-time interval invariant across inertial reference frames. 
But what exactly is the space-time interval (that is, how should you combine 
the space terms and the time term)? 

The answer to that question can be understood by imagining a pulse of light
radiating spherically outward from a certain location. Calling the speed of
light c, an observer in the unprimed coordinate system will find the square
of the distance covered by a wavefront of the light wave in time t to be
$x^2 + y^2 + z^2 = (ct)^2$. Likewise, an observer in the primed coordinate system
will write this as $x'^2 + y'^2 + z'^2 = (ct')^2$. But by the second postulate of special
relativity, the speed of light must be the same for all observers. So

\[ (ct)^2 - (x^2 + y^2 + z^2) = (ct')^2 - (x'^2 + y'^2 + z'^2), \]

which indicates that the sign of the time term must be opposite to the sign of
the spatial terms if the speed of light is to be the same for all observers. Of
course, the negative sign could equally well be attached to the time term (as
long as the spatial terms were made positive), and you’ll find some texts using
that convention.

5 These equations assume that the origins of the two coordinate systems coincide at time t = 0.

The combination of one time and three spatial coordinates into a single
“four-vector” is best expressed using index notation:

\[ \begin{aligned}
x_0 &= ct, \\
x_1 &= x, \\
x_2 &= y, \\
x_3 &= z,
\end{aligned} \]

in which the speed of light (c) is used in the time term to ensure that all four
coordinates have dimensions of length.

Using this notation, the space-time interval (ds) can be written as

\[ (ds)^2 = (dx^0)^2 - (dx^1)^2 - (dx^2)^2 - (dx^3)^2. \]

This interval is the space-time equivalent of distance ($ds^2 = dx^2 + dy^2 + dz^2$)
in three-dimensional space.

Transformations that preserve the invariance of the space-time interval
across inertial reference frames are called “Lorentz transforms” after the Dutch
physicist Hendrik Lorentz. For motion in the +x-direction with speed v, the
Lorentz transformation is

\[ \begin{aligned}
x'_0 &= \gamma(x_0 - \beta x_1), \\
x'_1 &= \gamma(x_1 - \beta x_0), \\
x'_2 &= x_2, \\
x'_3 &= x_3,
\end{aligned} \]

where

\[ \beta = \frac{v}{c} \]

and

\[ \gamma = \frac{1}{\sqrt{1 - \beta^2}}. \]

This form of the space-time interval can be written using the metric tensor
$g_{\alpha\beta}$:

\[ (ds)^2 = g_{\alpha\beta}\, dx^\alpha dx^\beta, \]

in which the tensor $g_{\alpha\beta}$ corresponds to the Minkowski metric for flat
spacetime. In matrix form, that metric is

\[ g = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix}. \]

As you may recall if you’ve studied modern physics, the invariance of
the space-time interval under Lorentz transformation leads to several
interesting results for observers in different inertial reference frames. Those results
include:

(1) Length contraction: An observer in a given reference frame measures 
lengths in a moving reference frame to be contracted along the direction 
of motion. 

(2) Time dilation: An observer in a given reference frame measures time in a 
moving reference frame to run more slowly. 

(3) Relativity of simultaneity: An observer in a given reference frame will 
not agree with an observer in a moving reference frame as to whether two 
events are simultaneous. 
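To put numbers on these three results, the sketch below uses the factor $\gamma = 1/\sqrt{1 - \beta^2}$ and the time row of the Lorentz transformation, $x'_0 = \gamma(x_0 - \beta x_1)$. The relations "measured length = rest length / γ" and "measured time = γ × proper time" are the standard textbook forms and are stated here as assumptions, since this section does not derive them. The speed β = 0.6 and the unit rod and clock interval are arbitrary illustrative values.

```python
import math


def lorentz_factor(beta):
    """Return gamma = 1 / sqrt(1 - beta^2) for |beta| < 1."""
    if not abs(beta) < 1.0:
        raise ValueError("beta must satisfy |beta| < 1")
    return 1.0 / math.sqrt(1.0 - beta ** 2)


beta = 0.6
gamma = lorentz_factor(beta)

# (1) Length contraction: a rod of rest length 1.0 along the direction of motion.
contracted_length = 1.0 / gamma

# (2) Time dilation: an interval of 1.0 on the moving clock.
dilated_time = gamma * 1.0

# (3) Relativity of simultaneity: two events with the same x0 (simultaneous
#     in the unprimed frame) but different x1 get different x0' values.
event_a = (0.0, 0.0)   # (x0, x1)
event_b = (0.0, 1.0)
x0_prime_a = gamma * (event_a[0] - beta * event_a[1])
x0_prime_b = gamma * (event_b[0] - beta * event_b[1])

print(gamma, contracted_length, dilated_time, x0_prime_a, x0_prime_b)
```

For β = 0.6 the factor γ is about 1.25, so the rod is measured at about 0.8 of its rest length, the clock interval at about 1.25, and the two events are separated by about 0.75 in $x'_0$.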

Writing physical laws in a form that clearly fits within the framework of
Special Relativity has several benefits: such “manifestly covariant” laws have
the same form in all inertial reference frames, and the quantities involved
transform between reference frames in predictable ways. Any covariant
theory of electromagnetism must incorporate the experimental fact that quantity
of charge is a scalar (invariant between reference frames), and that Maxwell’s
Equations and the Lorentz force law are true in all inertial reference frames.
This requires a tensor version of the electromagnetic field equations and a
four-vector version of the Lorentz force law, which can be accomplished by
expressing the electric charge density ρ and current density $\vec{J}$ as a four-vector
called the “four-current”:

\[ \vec{J} = (c\rho,\ J_x,\ J_y,\ J_z). \]

With the four-current in hand, a tensor version of Maxwell’s Equations
can be achieved by combining the components of the electric and magnetic
field into an “electromagnetic field tensor.” The matrix representing the
contravariant version of this tensor is 6

\[ F^{\alpha\beta} = \begin{pmatrix}
0 & -E_x/c & -E_y/c & -E_z/c \\
E_x/c & 0 & -B_z & B_y \\
E_y/c & B_z & 0 & -B_x \\
E_z/c & -B_y & B_x & 0
\end{pmatrix}. \tag{6.13} \]


The covariant version of this tensor can be found by lowering the indices
using the metric tensor. The result is

\[ F_{\alpha\beta} = \begin{pmatrix}
0 & E_x/c & E_y/c & E_z/c \\
-E_x/c & 0 & -B_z & B_y \\
-E_y/c & B_z & 0 & -B_x \\
-E_z/c & -B_y & B_x & 0
\end{pmatrix}. \tag{6.14} \]
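Lowering both indices with the Minkowski metric amounts to the matrix product g F g, which flips the sign of the electric-field entries and leaves the magnetic-field entries unchanged. Here is a small sketch of that operation; the field values and the choice c = 3 are illustrative units chosen only so that E/c comes out as round numbers, and the function names are hypothetical.

```python
METRIC = [
    [1, 0, 0, 0],
    [0, -1, 0, 0],
    [0, 0, -1, 0],
    [0, 0, 0, -1],
]


def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def field_tensor_upper(E, B, c):
    """Contravariant field tensor laid out as in Eq. 6.13."""
    Ex, Ey, Ez = E
    Bx, By, Bz = B
    return [
        [0.0, -Ex / c, -Ey / c, -Ez / c],
        [Ex / c, 0.0, -Bz, By],
        [Ey / c, Bz, 0.0, -Bx],
        [Ez / c, -By, Bx, 0.0],
    ]


def lower_indices(F_upper):
    """F_ab = g_am F^mn g_nb, written as the matrix product g F g."""
    return matmul(matmul(METRIC, F_upper), METRIC)


# Illustrative values: with c = 3 the entries E/c are simply 1, 2, 3.
F_upper = field_tensor_upper(E=(3.0, 6.0, 9.0), B=(0.1, 0.2, 0.3), c=3.0)
F_lower = lower_indices(F_upper)

for row in F_lower:
    print(row)
```

Comparing the printed rows with `F_upper` shows the sign change in the first row and first column only, as in Eq. 6.14.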


Another useful tensor is the dual contravariant electromagnetic field tensor

\[ \mathcal{G}^{\alpha\beta} = \begin{pmatrix}
0 & -B_x & -B_y & -B_z \\
B_x & 0 & E_z/c & -E_y/c \\
B_y & -E_z/c & 0 & E_x/c \\
B_z & E_y/c & -E_x/c & 0
\end{pmatrix}. \tag{6.15} \]


One benefit of these tensor expressions is that all of Maxwell’s Equations
may now be expressed using just two tensor equations. Those two equations
are:

\[ \frac{\partial F^{\alpha\beta}}{\partial x^\alpha} = \mu_0 J^\beta, \tag{6.16} \]

and

\[ \frac{\partial \mathcal{G}^{\alpha\beta}}{\partial x^\alpha} = 0. \tag{6.17} \]

Where are Maxwell’s Equations in these expressions? Well, to find Gauss’s
Law for electric fields, take β = 0 in Eq. 6.16:

\[ \frac{\partial F^{\alpha 0}}{\partial x^\alpha} = \mu_0 J^0. \]


6 You should be aware that there are almost as many versions of this matrix as there are authors; 
this book’s website has an explanation of the reasons for the differences between the versions 
found in several popular texts. 




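Inserting the values from the electromagnetic field-strength tensor of Eq. 6.13
and summing over the dummy index α gives

\[ \frac{\partial (0)}{\partial (ct)} + \frac{\partial (E_x/c)}{\partial x} + \frac{\partial (E_y/c)}{\partial y} + \frac{\partial (E_z/c)}{\partial z} = \mu_0 (c\rho). \]

Thus

\[ \frac{\partial E_x}{\partial x} + \frac{\partial E_y}{\partial y} + \frac{\partial E_z}{\partial z} = \mu_0 (c^2 \rho), \]

and, since $c^2 = 1/(\epsilon_0 \mu_0)$,

\[ \frac{\partial E_x}{\partial x} + \frac{\partial E_y}{\partial y} + \frac{\partial E_z}{\partial z} = \frac{\mu_0}{\epsilon_0 \mu_0}\, \rho, \]

or

\[ \vec{\nabla} \circ \vec{E} = \frac{\rho}{\epsilon_0}, \]

which is Gauss’s Law for electric fields.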

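To get the Ampere-Maxwell Law, look at the equations that result from
setting β equal to 1, 2, and 3 in Eq. 6.16:

\[ \begin{aligned}
\frac{\partial F^{\alpha 1}}{\partial x^\alpha} &= \mu_0 J^1, \\
\frac{\partial F^{\alpha 2}}{\partial x^\alpha} &= \mu_0 J^2, \\
\frac{\partial F^{\alpha 3}}{\partial x^\alpha} &= \mu_0 J^3.
\end{aligned} \]

As above, just insert the values from the electromagnetic field-strength tensor
of Eq. 6.13 and sum over the dummy index α:

\[ \begin{aligned}
\frac{\partial (-E_x/c)}{\partial (ct)} + \frac{\partial (0)}{\partial x} + \frac{\partial (B_z)}{\partial y} + \frac{\partial (-B_y)}{\partial z} &= \mu_0 (J_x), \\
\frac{\partial (-E_y/c)}{\partial (ct)} + \frac{\partial (-B_z)}{\partial x} + \frac{\partial (0)}{\partial y} + \frac{\partial (B_x)}{\partial z} &= \mu_0 (J_y), \\
\frac{\partial (-E_z/c)}{\partial (ct)} + \frac{\partial (B_y)}{\partial x} + \frac{\partial (-B_x)}{\partial y} + \frac{\partial (0)}{\partial z} &= \mu_0 (J_z).
\end{aligned} \]

Hence

\[ \begin{aligned}
\frac{\partial B_z}{\partial y} - \frac{\partial B_y}{\partial z} &= \mu_0 (J_x) + \frac{1}{c^2} \frac{\partial E_x}{\partial t}, \\
\frac{\partial B_x}{\partial z} - \frac{\partial B_z}{\partial x} &= \mu_0 (J_y) + \frac{1}{c^2} \frac{\partial E_y}{\partial t}, \\
\frac{\partial B_y}{\partial x} - \frac{\partial B_x}{\partial y} &= \mu_0 (J_z) + \frac{1}{c^2} \frac{\partial E_z}{\partial t}.
\end{aligned} \]

Recognizing the partial derivatives of the magnetic field as the components of
the curl of $\vec{B}$, this is

\[ \vec{\nabla} \times \vec{B} = \mu_0 \vec{J} + \mu_0 \epsilon_0 \frac{\partial \vec{E}}{\partial t}, \]

the Ampere-Maxwell Law.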

The other two Maxwell Equations (Gauss’s Law for magnetic fields and
Faraday’s Law) may be obtained in a similar fashion using the dual
electromagnetic field-strength tensor (Eq. 6.15). For example, to find Gauss’s Law
for magnetic fields, take β = 0 in Eq. 6.17:

\[ \frac{\partial \mathcal{G}^{\alpha 0}}{\partial x^\alpha} = 0. \]

Inserting the values from the dual electromagnetic field-strength tensor of
Eq. 6.15 and summing over the dummy index α gives

\[ \frac{\partial (0)}{\partial (ct)} + \frac{\partial (B_x)}{\partial x} + \frac{\partial (B_y)}{\partial y} + \frac{\partial (B_z)}{\partial z} = 0, \]

which is

\[ \vec{\nabla} \circ \vec{B} = 0, \]

Gauss’s Law for magnetic fields.

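And to get Faraday’s Law, look at the equations that result from setting
β equal to 1, 2, and 3 in Eq. 6.17:

\[ \begin{aligned}
\frac{\partial \mathcal{G}^{\alpha 1}}{\partial x^\alpha} &= 0, \\
\frac{\partial \mathcal{G}^{\alpha 2}}{\partial x^\alpha} &= 0, \\
\frac{\partial \mathcal{G}^{\alpha 3}}{\partial x^\alpha} &= 0.
\end{aligned} \]

As before, just insert the values from the dual electromagnetic field-strength
tensor of Eq. 6.15 and sum over the dummy index α:

\[ \begin{aligned}
\frac{\partial (-B_x)}{\partial (ct)} + \frac{\partial (0)}{\partial x} + \frac{\partial (-E_z/c)}{\partial y} + \frac{\partial (E_y/c)}{\partial z} &= 0, \\
\frac{\partial (-B_y)}{\partial (ct)} + \frac{\partial (E_z/c)}{\partial x} + \frac{\partial (0)}{\partial y} + \frac{\partial (-E_x/c)}{\partial z} &= 0, \\
\frac{\partial (-B_z)}{\partial (ct)} + \frac{\partial (-E_y/c)}{\partial x} + \frac{\partial (E_x/c)}{\partial y} + \frac{\partial (0)}{\partial z} &= 0.
\end{aligned} \]

So

\[ \begin{aligned}
\frac{\partial E_y}{\partial z} - \frac{\partial E_z}{\partial y} &= \frac{\partial B_x}{\partial t}, \\
\frac{\partial E_z}{\partial x} - \frac{\partial E_x}{\partial z} &= \frac{\partial B_y}{\partial t}, \\
\frac{\partial E_x}{\partial y} - \frac{\partial E_y}{\partial x} &= \frac{\partial B_z}{\partial t}.
\end{aligned} \]

Recognizing the partial derivatives of the electric field as the components of
the curl of $\vec{E}$, this is Faraday’s Law:

\[ \vec{\nabla} \times \vec{E} = -\frac{\partial \vec{B}}{\partial t}. \]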

So the use of tensors allows you to write Maxwell’s Equations in a simpler 
form. But the real power of tensors is to help you understand the behavior 
of electric and magnetic fields when viewed from different reference frames. 
Specifically, by transforming to a moving reference frame, it becomes clear 
that electric and magnetic fields depend on the state of motion of the observer. 

To see how that comes about, imagine an observer in a reference frame 
moving along the positive x-axis at a constant speed v. You can investi- 
gate the behavior of electric and magnetic fields as seen by this observer by 
transforming the electromagnetic field tensor to the observer’s reference frame. 

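Recall the Lorentz transform matrix for motion along the x-axis with
speed v:

\[ \Lambda = \begin{pmatrix}
\gamma & -\gamma\beta & 0 & 0 \\
-\gamma\beta & \gamma & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix}. \tag{6.18} \]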


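So to transform to the primed coordinate system, use

\[ F' = \Lambda F \Lambda^T, \]

which is

\[ F' = \begin{pmatrix}
\gamma & -\gamma\beta & 0 & 0 \\
-\gamma\beta & \gamma & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix}
\begin{pmatrix}
0 & -E_x/c & -E_y/c & -E_z/c \\
E_x/c & 0 & -B_z & B_y \\
E_y/c & B_z & 0 & -B_x \\
E_z/c & -B_y & B_x & 0
\end{pmatrix}
\begin{pmatrix}
\gamma & -\gamma\beta & 0 & 0 \\
-\gamma\beta & \gamma & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix}. \]

Multiplying the center matrix by the right matrix gives

\[ \begin{pmatrix}
(-E_x/c)(-\gamma\beta) & (-E_x/c)(\gamma) & -E_y/c & -E_z/c \\
(E_x/c)(\gamma) & (E_x/c)(-\gamma\beta) & -B_z & B_y \\
(E_y/c)(\gamma) + (B_z)(-\gamma\beta) & (E_y/c)(-\gamma\beta) + (B_z)(\gamma) & 0 & -B_x \\
(E_z/c)(\gamma) + (-B_y)(-\gamma\beta) & (E_z/c)(-\gamma\beta) + (B_y)(-\gamma) & B_x & 0
\end{pmatrix}, \]

which, when multiplied by the left array, gives

\[ \begin{pmatrix}
(E_x/c)\gamma^2\beta - (E_x/c)\gamma^2\beta & -(E_x/c)\gamma^2 + (E_x/c)\gamma^2\beta^2 & -(E_y/c)\gamma + (B_z)\gamma\beta & -(E_z/c)\gamma - (B_y)\gamma\beta \\
(E_x/c)\gamma^2 - (E_x/c)\gamma^2\beta^2 & 0 & (E_y/c)\gamma\beta - (B_z)\gamma & (E_z/c)\gamma\beta + (B_y)\gamma \\
(E_y/c)\gamma - (B_z)\gamma\beta & -(E_y/c)\gamma\beta + (B_z)\gamma & 0 & -B_x \\
(E_z/c)\gamma + (B_y)\gamma\beta & -(E_z/c)\gamma\beta - (B_y)\gamma & B_x & 0
\end{pmatrix}. \]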

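Thus, since $\gamma^2(1 - \beta^2) = 1$,

\[ F' = \begin{pmatrix}
0 & -E_x/c & -\gamma(E_y/c - \beta B_z) & -\gamma(E_z/c + \beta B_y) \\
E_x/c & 0 & -\gamma(B_z - \beta E_y/c) & \gamma(B_y + \beta E_z/c) \\
\gamma(E_y/c - \beta B_z) & \gamma(B_z - \beta E_y/c) & 0 & -B_x \\
\gamma(E_z/c + \beta B_y) & -\gamma(B_y + \beta E_z/c) & B_x & 0
\end{pmatrix}. \]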


Comparing this to Eq. 6.13, the components of the electric field in the new 
(primed) coordinate system can be related to the components of the electric 
field in the original (unprimed) coordinate system by 

E' x = E x , 

E' y = cy(E y /c — PB Z ), (6.19) 

E' z = cy(E z /c + pB y ), 

and the magnetic field components in the new (primed) system are 

B x = B x , 

B' y = y(B y + pE z /c), 

B z = y(B z - PEy/c). 


(6.20) 
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This is a profound result, since it indicates that the existence of electric and 
magnetic fields depends on the motion of the observer. 

To understand the implications of these results, consider the case in which
$E_x = E_y = E_z = 0$ but one or more components of B are non-zero (this
occurs, for example, when a long, straight wire carries a steady electric
current). This means that an observer in the unprimed coordinate system sees
a magnetic field but no electric field. However, transforming to the primed
coordinate system, Eqs. 6.19 and 6.20 tell you that an observer in the primed
coordinate system sees both electric and magnetic fields (since in this case
$E'_y = -c\gamma\beta B_z$ and $E'_z = c\gamma\beta B_y$). So does the electric field exist or not?
The answer depends on the motion of the observer.

Now consider a case in which $B_x = B_y = B_z = 0$ but one or more
components of E are non-zero in the unprimed system (for example, an electric
charge at rest in the unprimed system). For this case, an observer in the
primed system does see a magnetic field with components $B'_y = \gamma\beta E_z/c$ and
$B'_z = -\gamma\beta E_y/c$ (this makes sense, since the observer in the primed system
sees a moving electric charge, which is an electric current, and electric currents
produce magnetic fields). Cases such as these explain the reasoning behind the
statement that electric and magnetic fields “have no independent existence.”
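
The two cases above can also be checked numerically. The following Python sketch is not from the text: it assumes SI units, a boost along the x-axis, and the field-tensor layout used in the matrices above, and the helper names (`field_tensor`, `boost_x`, `transform_fields`) are invented for this illustration. It forms the triple matrix product directly and reads the primed field components back out of the result.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def field_tensor(E, B):
    """Field tensor built from E (V/m) and B (T), in the layout assumed above."""
    ex, ey, ez = (component / C for component in E)
    bx, by, bz = B
    return [[0.0, -ex, -ey, -ez],
            [ex, 0.0, -bz, by],
            [ey, bz, 0.0, -bx],
            [ez, -by, bx, 0.0]]


def boost_x(beta):
    """Lorentz transformation matrix for motion along x at speed beta * c."""
    gamma = 1.0 / math.sqrt(1.0 - beta ** 2)
    return [[gamma, -gamma * beta, 0.0, 0.0],
            [-gamma * beta, gamma, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]


def transform_fields(E, B, beta):
    """Return (E', B') seen by an observer moving along x at speed beta * c."""
    lam = boost_x(beta)
    lam_t = [list(column) for column in zip(*lam)]  # transpose
    f_prime = matmul(matmul(lam, field_tensor(E, B)), lam_t)
    # Read the primed fields from the same slots they occupy in field_tensor.
    e_prime = (C * f_prime[1][0], C * f_prime[2][0], C * f_prime[3][0])
    b_prime = (f_prime[3][2], f_prime[1][3], f_prime[2][1])
    return e_prime, b_prime


if __name__ == "__main__":
    # Pure magnetic field in the unprimed frame (illustrative values only).
    e_prime, b_prime = transform_fields((0.0, 0.0, 0.0), (0.0, 2.0e-3, 1.0e-3), 0.5)
    print("E' =", e_prime)
    print("B' =", b_prime)
```

Feeding in a pure magnetic field returns non-zero y- and z-components of the primed electric field, and feeding in a pure electric field returns non-zero components of the primed magnetic field, in line with Eqs. 6.19 and 6.20.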

The problems at the end of this chapter will give you an idea of the relative 
magnitudes of fields seen by an observer at rest and a second observer moving 
at a significant fraction of the speed of light. 


6.3 The Riemann curvature tensor 

In the decade after publishing his Theory of Special Relativity in 1905, Albert 
Einstein turned his attention to what he called a “deficiency” in classical 
mechanics: the lack of an explanation for the precise equality of inertial and 
gravitational mass. An object’s inertial mass determines its resistance to accel- 
eration, and its gravitational mass determines its response to a gravitational 
field. The equality of these differently defined masses cannot be explained 
by classical mechanics, and Einstein’s scientific instincts told him that the 
resolution of this deficiency could be achieved by “an extension of the prin- 
ciple of relativity to spaces of reference which are not in uniform motion 
relative to one another.” 7 He applied the word “General” to this extension of 
his theory of relativity because this new theory would not be restricted to the 
non-accelerating reference frames of Special Relativity. 

7 A. Einstein, The Meaning of Relativity.

Early in his work on the General Theory, Einstein constructed a
Gedankenexperiment (that is, a mental exercise) in which he imagined a group of
objects with different masses far away from the Earth and from all other masses - you
can think of this as a bunch of rocks far out in space. The behavior of these 
objects is observed from two reference systems, one of which is called sys- 
tem K and is “inertial” or non-accelerating with respect to the rocks. The other 
system, called system K', is in uniform acceleration with respect to the first. 
For an observer in the K' system, the objects all accelerate in the same direc- 
tion (opposite to the direction of the acceleration of the K' system) and at 
the same rate (equal to the rate of acceleration of the K' system). Seeing all 
objects accelerating in the same direction and at the same rate, that observer 
would be entirely justified in concluding that the acceleration of the objects 
is produced by an external gravitational field and that the K' system is at rest. 
Einstein realized that both the K and the K' systems are valid frames of refer- 
ence, and he termed the complete equivalence of such systems the “principle 
of equivalence.” 

Einstein’s next step was to overlay the z'-axis of the K' system with the z-axis
of the K system and then to allow the K' system to rotate about the z'-axis 
with uniform angular speed (recall that a rotating object experiences centripetal 
acceleration, so rotation makes K' an accelerated system). If system K' were 
not rotating, the size of objects and rate of time flow measured in the K and K' 
systems would be the same. But when system K' is rotating, objects at rest in 
K' will be moving when measured in the K system and will therefore experi- 
ence length contraction and time dilation, and the amount of contraction and 
dilation will depend on the location of the objects (since objects farther from 
the rotation axis will have higher velocity). Since the principle of equivalence 
demands that an accelerated system and a system at rest in a gravitational field 
are equivalent, Einstein was forced to conclude that length contraction and 
time dilation could also be produced by gravity, or as he put it “the gravita- 
tional field influences and even determines the metrical laws of the space-time 
continuum.” 

Those metrical laws are expressed using tensors, so the General Theory 
of Relativity relies on tensor formulation of physical laws and on concepts 
described in earlier chapters, such as the metric tensor, Christoffel symbols, 
and covariant derivatives. The most important tensor in General Relativity is
the Riemann curvature tensor, sometimes called the Riemann-Christoffel tensor
after the nineteenth-century German mathematicians Bernhard Riemann
and Elwin Bruno Christoffel. The importance of this tensor stems from the
fact that non-zero components are the hallmark of curvature; the vanishing of
the Riemann tensor is both a necessary and a sufficient condition for Euclidean
(flat) space.

Most texts use one of two ways to derive the Riemann curvature tensor: 
parallel transport or the commutator of the covariant derivative. To understand 
the parallel-transport approach, you should first understand that “parallel trans- 
port” refers to a method of moving a vector around a space while keeping the 
length and direction of the vector the same. In Cartesian flat space, making 
sure the vector’s magnitude and direction don’t change is straightforward - 
just move the vector around without allowing the x-, y-, or z-components to
change. If the components don’t change, then the length and the direction of 
the vector don’t change, and this satisfies the requirements of parallel transport. 

In curved space, the situation is more complex. For one thing, “pointing
in the same direction” becomes more difficult to define. Consider the two-
dimensional space that is the surface of the Earth (and pretend for the moment
that it’s perfectly smooth). Imagine a vector that is initially at the equator (say
a bit north of Quito, Ecuador) and is pointing due north, directly along the
meridian line. Now imagine transporting that vector toward the North Pole,
all the while making sure it remains pointed exactly along the meridian line. 
Remember, the entire space is the surface of the Earth, so the vector must 
remain tangent to the surface (that is, locally horizontal) as you move it. If you 
continue moving your vector along the meridian line and pass over the North 
Pole and then “down” the other side of the Earth, you will eventually reach the 
equator again somewhere near the middle of Indonesia. Your vector will still 
be pointing along the meridian, but now it will be pointing south. So although 
you’ve kept your vector pointing “in the same direction” (that is, along the 
meridian) over the entire trip, it’s gone from pointing north to pointing south. 

Now imagine making another trip, also starting with a north-pointing vector 
at the equator near Quito, but this time moving along the equator instead of 
over the North Pole. Once again, as you move you make sure that your vector 
continues to point north (along the local meridian). After a long journey, you 
arrive in the middle of Indonesia, but this time you find that your vector is 
pointing north. Hence the direction of the vector at the end of the journey 
depends on the path taken, even though you used parallel transport in each 
case. And whenever the result of parallel transport is a change in the direction 
of a vector, you can be sure you’re dealing with a curved space. 
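
The two journeys can be reproduced in a short Python sketch. This is an illustration rather than anything from the text, and it rests on stated assumptions: the Earth is a unit sphere embedded in three dimensions, the start and end points are idealized as exactly opposite points on the equator (stand-ins for Quito and Indonesia), and parallel transport of a tangent vector along a great-circle arc is taken to be the rigid rotation that carries the arc's start point to its end point. All function and constant names are hypothetical.

```python
import math


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def rotate(v, axis, angle):
    """Rodrigues' rotation of v about a unit axis by angle (radians)."""
    c, s = math.cos(angle), math.sin(angle)
    axis_cross_v = cross(axis, v)
    axis_dot_v = dot(axis, v)
    return tuple(v[i] * c + axis_cross_v[i] * s + axis[i] * axis_dot_v * (1.0 - c)
                 for i in range(3))


def transport_along_arc(v, start, end):
    """Carry tangent vector v from start to end along the great-circle arc.

    start and end are unit position vectors, neither equal nor opposite.
    """
    normal = cross(start, end)
    length = math.sqrt(dot(normal, normal))
    axis = tuple(n / length for n in normal)
    angle = math.atan2(length, dot(start, end))
    return rotate(v, axis, angle)


def transport_along_path(v, waypoints):
    """Apply transport_along_arc over consecutive pairs of waypoints."""
    for start, end in zip(waypoints, waypoints[1:]):
        v = transport_along_arc(v, start, end)
    return v


QUITO = (1.0, 0.0, 0.0)         # idealized start point on the equator
NORTH_POLE = (0.0, 0.0, 1.0)
QUARTER_WAY = (0.0, 1.0, 0.0)   # on the equator, a quarter turn to the east
INDONESIA = (-1.0, 0.0, 0.0)    # idealized end point, half a turn away
DUE_NORTH = (0.0, 0.0, 1.0)     # tangent vector pointing north at the equator

over_pole = transport_along_path(DUE_NORTH, [QUITO, NORTH_POLE, INDONESIA])
along_equator = transport_along_path(DUE_NORTH, [QUITO, QUARTER_WAY, INDONESIA])

print("over the pole:    ", over_pole)
print("along the equator:", along_equator)
```

Within rounding error, the vector carried over the pole arrives with a negative z-component (pointing south) while the vector carried along the equator arrives unchanged (pointing north), which is the path dependence described above.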

This raises a larger issue: it’s not possible to add, subtract, multiply, or in 
any way compare vectors at different locations - you have to transport one of 
the vectors to the location of the other before you can perform such operations. 
That’s no problem in flat space, because you can parallel-transport a vector to 
any other location simply by keeping its components constant (ensuring that the
vector’s length is constant and that it remains pointed in the same direction).
But while “pointed in the same direction” is easily defined at different locations 
in flat space, you’ve just seen that this phrase is problematic in curved space. 
Hence a more-general definition of parallel transport is required. 

In that definition, “parallel transport” is defined as transport for which the 
covariant derivative is zero. Remember that the covariant derivative is the com- 
bination of two terms, the first of which is the usual partial derivative, and the 
second of which involves a Christoffel symbol. As described in Section 5.7 
in Chapter 5, the purpose of that second term is to account for changes in 
the basis vectors. Holding the covariant derivative at zero while transporting a 
vector around a small loop is one way to derive the Riemann tensor. 8

The Riemann curvature tensor also falls naturally out of the commutator of the
covariant derivative of a vector. In this usage, “commutator” refers to the
difference that results from performing two operations first in one order and then
in the reverse order. So if one operator is denoted by A and another operator
by B, the commutator is defined as $[A, B] = AB - BA$. Thus if the sequence of
the two operations has no impact on the result, the commutator has a value of
zero.
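
As a concrete illustration of this definition (not taken from the text), the Python sketch below represents operators as small matrices and forms AB - BA. The matrices chosen here, two quarter-turn rotations and a uniform scaling, are assumptions made purely for the example.

```python
def matmul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def commutator(a, b):
    """Return [A, B] = AB - BA."""
    ab, ba = matmul(a, b), matmul(b, a)
    n = len(a)
    return [[ab[i][j] - ba[i][j] for j in range(n)] for i in range(n)]


ROT_X = [[1, 0, 0], [0, 0, -1], [0, 1, 0]]   # quarter turn about the x-axis
ROT_Z = [[0, -1, 0], [1, 0, 0], [0, 0, 1]]   # quarter turn about the z-axis
SCALE = [[2, 0, 0], [0, 2, 0], [0, 0, 2]]    # uniform scaling by 2

print(commutator(ROT_X, SCALE))  # order does not matter: every entry is zero
print(commutator(ROT_X, ROT_Z))  # order matters: some entries are non-zero
```

Scaling and rotating can be done in either order with the same result, so that commutator vanishes; the two rotations give different results in different orders, so theirs does not.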

To get to the Riemann tensor, the operation of choice is covariant differenti- 
ation. That’s because in a flat space the order of covariant differentiation makes 
no difference, so the commutator must yield zero. Any non-zero result of 
applying the commutator to covariant differentiation can therefore be attributed 
to the curvature of the space. 

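To begin this process, take the covariant derivative of vector $V_\alpha$ first with
respect to $x^\beta$:

$$
V_{\alpha;\beta} = \frac{\partial V_\alpha}{\partial x^\beta} - \Gamma^\sigma_{\alpha\beta} V_\sigma. \tag{6.21}
$$

Now call this result $V_{\alpha\beta}$ and take another covariant derivative (this time with
respect to $x^\gamma$):

$$
V_{\alpha\beta;\gamma} = \frac{\partial V_{\alpha\beta}}{\partial x^\gamma} - \Gamma^\tau_{\alpha\gamma} V_{\tau\beta} - \Gamma^\eta_{\beta\gamma} V_{\alpha\eta}. \tag{6.22}
$$

Substituting the expression from Eq. 6.21 into this equation gives

$$
V_{\alpha\beta;\gamma} = \frac{\partial^2 V_\alpha}{\partial x^\gamma \partial x^\beta}
- \frac{\partial \Gamma^\sigma_{\alpha\beta}}{\partial x^\gamma} V_\sigma
- \Gamma^\sigma_{\alpha\beta} \frac{\partial V_\sigma}{\partial x^\gamma}
- \Gamma^\tau_{\alpha\gamma} \frac{\partial V_\tau}{\partial x^\beta}
+ \Gamma^\tau_{\alpha\gamma} \Gamma^\sigma_{\tau\beta} V_\sigma
- \Gamma^\eta_{\beta\gamma} \frac{\partial V_\alpha}{\partial x^\eta}
+ \Gamma^\eta_{\beta\gamma} \Gamma^\sigma_{\alpha\eta} V_\sigma. \tag{6.23}
$$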

8 You can find the details in Schutz, A First Course in General Relativity, Cambridge University
Press, 2009.

It’s not easy to see the physical significance in this expression, but remember
how you got here: first by finding the incremental change in $V_\alpha$ as you take a
small step in the $x^\beta$-direction, and then finding the change in that quantity as
you take a small step in the $x^\gamma$-direction. And now you’re going to compare the
result of these two operations with the result you get when you take the steps
in reverse order - from the same starting point, you’ll first find the incremental
change in $V_\alpha$ as you take a small step in the $x^\gamma$-direction, after which you’ll
find the change in that quantity as you take a small step in the $x^\beta$-direction.

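To take the covariant derivatives in the opposite order, differentiate first with
respect to $x^\gamma$:

$$
V_{\alpha;\gamma} = \frac{\partial V_\alpha}{\partial x^\gamma} - \Gamma^\sigma_{\alpha\gamma} V_\sigma. \tag{6.24}
$$

Call this result $V_{\alpha\gamma}$ and take another covariant derivative (this time with respect
to $x^\beta$):

$$
V_{\alpha\gamma;\beta} = \frac{\partial V_{\alpha\gamma}}{\partial x^\beta} - \Gamma^\tau_{\alpha\beta} V_{\tau\gamma} - \Gamma^\eta_{\gamma\beta} V_{\alpha\eta}. \tag{6.25}
$$

As before, you can substitute the expression from Eq. 6.24 into this equation
to get

$$
V_{\alpha\gamma;\beta} = \frac{\partial^2 V_\alpha}{\partial x^\beta \partial x^\gamma}
- \frac{\partial \Gamma^\sigma_{\alpha\gamma}}{\partial x^\beta} V_\sigma
- \Gamma^\sigma_{\alpha\gamma} \frac{\partial V_\sigma}{\partial x^\beta}
- \Gamma^\tau_{\alpha\beta} \frac{\partial V_\tau}{\partial x^\gamma}
+ \Gamma^\tau_{\alpha\beta} \Gamma^\sigma_{\tau\gamma} V_\sigma
- \Gamma^\eta_{\gamma\beta} \frac{\partial V_\alpha}{\partial x^\eta}
+ \Gamma^\eta_{\gamma\beta} \Gamma^\sigma_{\alpha\eta} V_\sigma. \tag{6.26}
$$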


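In flat space, the order of covariant differentiation should make no difference,
so Eq. 6.26 should be identical to Eq. 6.23. Any differences between
these equations can therefore be attributed to the curvature of the space.
Examining these two equations term by term, the first terms are equal:

$$
\frac{\partial^2 V_\alpha}{\partial x^\gamma \partial x^\beta} = \frac{\partial^2 V_\alpha}{\partial x^\beta \partial x^\gamma}
$$

(these terms are equal because the order of normal partial derivatives does
not matter). Hence these terms cancel in the commutator. Now comparing the
second terms,

$$
\frac{\partial \Gamma^\sigma_{\alpha\beta}}{\partial x^\gamma} V_\sigma \neq \frac{\partial \Gamma^\sigma_{\alpha\gamma}}{\partial x^\beta} V_\sigma,
$$

so these terms do not cancel one another. Comparing the third term of Eq. 6.23
to the fourth term of Eq. 6.26, they’re found to be equal:

$$
-\Gamma^\sigma_{\alpha\beta} \frac{\partial V_\sigma}{\partial x^\gamma} = -\Gamma^\tau_{\alpha\beta} \frac{\partial V_\tau}{\partial x^\gamma},
$$

because the symbols used for dummy indices ($\sigma$ and $\tau$) are irrelevant. The
fourth term of Eq. 6.23 equals the third term of Eq. 6.26:

$$
-\Gamma^\tau_{\alpha\gamma} \frac{\partial V_\tau}{\partial x^\beta} = -\Gamma^\sigma_{\alpha\gamma} \frac{\partial V_\sigma}{\partial x^\beta},
$$

for the same reason. The fifth terms are not equal:

$$
\Gamma^\tau_{\alpha\gamma} \Gamma^\sigma_{\tau\beta} V_\sigma \neq \Gamma^\tau_{\alpha\beta} \Gamma^\sigma_{\tau\gamma} V_\sigma.
$$

But the sixth terms are equal:

$$
-\Gamma^\eta_{\beta\gamma} \frac{\partial V_\alpha}{\partial x^\eta} = -\Gamma^\eta_{\gamma\beta} \frac{\partial V_\alpha}{\partial x^\eta},
$$

because Christoffel symbols are symmetric in their lower indices. The seventh
terms are equal for the same reason:

$$
\Gamma^\eta_{\beta\gamma} \Gamma^\sigma_{\alpha\eta} V_\sigma = \Gamma^\eta_{\gamma\beta} \Gamma^\sigma_{\alpha\eta} V_\sigma.
$$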

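So when the commutator $AB - BA$ is formed, most of the terms cancel out,
but the second and fifth terms remain after subtraction. Those terms are

$$
\begin{aligned}
V_{\alpha\beta;\gamma} - V_{\alpha\gamma;\beta}
&= \frac{\partial \Gamma^\sigma_{\alpha\gamma}}{\partial x^\beta} V_\sigma
- \frac{\partial \Gamma^\sigma_{\alpha\beta}}{\partial x^\gamma} V_\sigma
+ \Gamma^\tau_{\alpha\gamma} \Gamma^\sigma_{\tau\beta} V_\sigma
- \Gamma^\tau_{\alpha\beta} \Gamma^\sigma_{\tau\gamma} V_\sigma \\
&= \left( \frac{\partial \Gamma^\sigma_{\alpha\gamma}}{\partial x^\beta}
- \frac{\partial \Gamma^\sigma_{\alpha\beta}}{\partial x^\gamma}
+ \Gamma^\tau_{\alpha\gamma} \Gamma^\sigma_{\tau\beta}
- \Gamma^\tau_{\alpha\beta} \Gamma^\sigma_{\tau\gamma} \right) V_\sigma.
\end{aligned}
\tag{6.27}
$$

The terms within the parentheses define the Riemann curvature tensor:

$$
R^\sigma_{\alpha\beta\gamma} = \frac{\partial \Gamma^\sigma_{\alpha\gamma}}{\partial x^\beta}
- \frac{\partial \Gamma^\sigma_{\alpha\beta}}{\partial x^\gamma}
+ \Gamma^\tau_{\alpha\gamma} \Gamma^\sigma_{\tau\beta}
- \Gamma^\tau_{\alpha\beta} \Gamma^\sigma_{\tau\gamma}. \tag{6.28}
$$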


If you’re wondering why the curvature tensor involves the derivative of
Christoffel symbols, consider this: in any space, you can always define a
coordinate system for which the Christoffel symbols are all zero at some point. But
unless the space is flat, the Christoffel symbols will not be zero at all other
locations, which means that the partial derivatives of the Christoffel symbols
will not be zero. So a necessary and sufficient condition for flat space is that

$$
R^\sigma_{\alpha\beta\gamma} = 0. \tag{6.29}
$$

Another tensor related to the Riemann curvature tensor is the Ricci tensor,
which you can find by contracting the Riemann tensor along the $\sigma$ and
$\beta$ indices. In four dimensions, this is

$$
R_{\alpha\gamma} = R^\sigma_{\alpha\sigma\gamma} = R^1_{\alpha 1 \gamma} + R^2_{\alpha 2 \gamma} + R^3_{\alpha 3 \gamma} + R^4_{\alpha 4 \gamma}. \tag{6.30}
$$

If you contract the Ricci tensor by raising one index and setting it equal to
the other, the result is the Ricci scalar. Again in four dimensions, this is

$$
R = g^{\alpha\gamma} R_{\alpha\gamma} = R^\gamma_\gamma = R^1_1 + R^2_2 + R^3_3 + R^4_4. \tag{6.31}
$$

Finally, the tensor known as the “Einstein tensor” can be written as a
combination of the Ricci tensor, the Ricci scalar, and the metric:

$$
G_{\alpha\gamma} = R_{\alpha\gamma} - \frac{1}{2} R g_{\alpha\gamma}. \tag{6.32}
$$

This is the tensor that appears in Einstein’s field equation for General
Relativity, often written as

$$
G_{\mu\nu} + \Lambda g_{\mu\nu} = \frac{8\pi G}{c^4} T_{\mu\nu}, \tag{6.33}
$$

where $T_{\mu\nu}$ is the energy-momentum tensor, $G$ (without indices) is Newton’s
gravitational constant, and $\Lambda$ is the “cosmological constant” introduced by
Einstein to maintain a static Universe. It is this equation
that gives rise to the first half of the concise statement of General Relativity:
“Matter tells spacetime how to curve, and spacetime tells matter how to move.”

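To appreciate the full content of the Riemann tensor, consider a two-
dimensional space that is the surface of a sphere. The metric for such a
space is

$$
ds^2 = a^2 d\theta^2 + a^2 \sin^2(\theta)\, d\phi^2,
$$

from which the components of the metric tensor may be found to be

$$
g_{\theta\theta} = a^2, \qquad
g_{\theta\phi} = g_{\phi\theta} = 0, \qquad
g_{\phi\phi} = a^2 \sin^2(\theta). \tag{6.34}
$$

These values can be inserted into the equation for Christoffel symbols:

$$
\Gamma^l_{ij} = \frac{1}{2} g^{kl} \left[ \frac{\partial g_{ik}}{\partial x^j} + \frac{\partial g_{jk}}{\partial x^i} - \frac{\partial g_{ij}}{\partial x^k} \right].
$$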
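
The Christoffel-symbol equation can also be evaluated numerically. The Python sketch below is an illustration under stated assumptions, not the author's method: it takes the sphere metric of Eq. 6.34 with radius `A` set to 1, orders the coordinates as (theta, phi), and replaces the partial derivatives of the metric with central finite differences. The function names are hypothetical.

```python
import math

A = 1.0              # sphere radius (assumed value; it cancels in the result)
THETA, PHI = 0, 1    # coordinate indices


def metric(x):
    """Metric components g_ij of Eq. 6.34 at the point x = (theta, phi)."""
    theta = x[THETA]
    return [[A ** 2, 0.0],
            [0.0, (A * math.sin(theta)) ** 2]]


def inverse_2x2(g):
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    return [[g[1][1] / det, -g[0][1] / det],
            [-g[1][0] / det, g[0][0] / det]]


def metric_derivative(x, k, h=1e-5):
    """Central-difference estimate of d g_ij / d x^k, returned as a 2x2 list."""
    x_plus, x_minus = list(x), list(x)
    x_plus[k] += h
    x_minus[k] -= h
    g_plus, g_minus = metric(x_plus), metric(x_minus)
    return [[(g_plus[i][j] - g_minus[i][j]) / (2.0 * h) for j in range(2)]
            for i in range(2)]


def christoffel(x):
    """Return gamma[l][i][j], an estimate of Gamma^l_ij at the point x."""
    g_inv = inverse_2x2(metric(x))
    dg = [metric_derivative(x, k) for k in range(2)]  # dg[k][i][j] = d g_ij / d x^k
    return [[[0.5 * sum(g_inv[k][l] * (dg[j][i][k] + dg[i][j][k] - dg[k][i][j])
                        for k in range(2))
              for j in range(2)]
             for i in range(2)]
            for l in range(2)]


if __name__ == "__main__":
    theta = 1.0
    gamma = christoffel((theta, 0.3))
    print("Gamma^theta_phiphi:", gamma[THETA][PHI][PHI], "vs", -math.sin(theta) * math.cos(theta))
    print("Gamma^phi_thetaphi:", gamma[PHI][THETA][PHI], "vs", math.cos(theta) / math.sin(theta))
```

For this metric the estimates agree, to within the finite-difference error, with the closed forms $-\sin(\theta)\cos(\theta)$ and $\cot(\theta)$, and every other symbol comes out as zero.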

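Even in two dimensions, writing out all the terms of the Christoffel symbols
can be something of a chore:

$$
\begin{aligned}
\Gamma^\theta_{\theta\theta} &= \tfrac{1}{2} g^{\theta\theta} \left[ \frac{\partial g_{\theta\theta}}{\partial\theta} + \frac{\partial g_{\theta\theta}}{\partial\theta} - \frac{\partial g_{\theta\theta}}{\partial\theta} \right]
+ \tfrac{1}{2} g^{\phi\theta} \left[ \frac{\partial g_{\theta\phi}}{\partial\theta} + \frac{\partial g_{\theta\phi}}{\partial\theta} - \frac{\partial g_{\theta\theta}}{\partial\phi} \right], \\
\Gamma^\theta_{\theta\phi} = \Gamma^\theta_{\phi\theta} &= \tfrac{1}{2} g^{\theta\theta} \left[ \frac{\partial g_{\theta\theta}}{\partial\phi} + \frac{\partial g_{\phi\theta}}{\partial\theta} - \frac{\partial g_{\theta\phi}}{\partial\theta} \right]
+ \tfrac{1}{2} g^{\phi\theta} \left[ \frac{\partial g_{\theta\phi}}{\partial\phi} + \frac{\partial g_{\phi\phi}}{\partial\theta} - \frac{\partial g_{\theta\phi}}{\partial\phi} \right], \\
\Gamma^\theta_{\phi\phi} &= \tfrac{1}{2} g^{\theta\theta} \left[ \frac{\partial g_{\phi\theta}}{\partial\phi} + \frac{\partial g_{\phi\theta}}{\partial\phi} - \frac{\partial g_{\phi\phi}}{\partial\theta} \right]
+ \tfrac{1}{2} g^{\phi\theta} \left[ \frac{\partial g_{\phi\phi}}{\partial\phi} + \frac{\partial g_{\phi\phi}}{\partial\phi} - \frac{\partial g_{\phi\phi}}{\partial\phi} \right], \\
\Gamma^\phi_{\theta\theta} &= \tfrac{1}{2} g^{\theta\phi} \left[ \frac{\partial g_{\theta\theta}}{\partial\theta} + \frac{\partial g_{\theta\theta}}{\partial\theta} - \frac{\partial g_{\theta\theta}}{\partial\theta} \right]
+ \tfrac{1}{2} g^{\phi\phi} \left[ \frac{\partial g_{\theta\phi}}{\partial\theta} + \frac{\partial g_{\theta\phi}}{\partial\theta} - \frac{\partial g_{\theta\theta}}{\partial\phi} \right], \\
\Gamma^\phi_{\theta\phi} = \Gamma^\phi_{\phi\theta} &= \tfrac{1}{2} g^{\theta\phi} \left[ \frac{\partial g_{\theta\theta}}{\partial\phi} + \frac{\partial g_{\phi\theta}}{\partial\theta} - \frac{\partial g_{\theta\phi}}{\partial\theta} \right]
+ \tfrac{1}{2} g^{\phi\phi} \left[ \frac{\partial g_{\theta\phi}}{\partial\phi} + \frac{\partial g_{\phi\phi}}{\partial\theta} - \frac{\partial g_{\theta\phi}}{\partial\phi} \right], \\
\Gamma^\phi_{\phi\phi} &= \tfrac{1}{2} g^{\theta\phi} \left[ \frac{\partial g_{\phi\theta}}{\partial\phi} + \frac{\partial g_{\phi\theta}}{\partial\phi} - \frac{\partial g_{\phi\phi}}{\partial\theta} \right]
+ \tfrac{1}{2} g^{\phi\phi} \left[ \frac{\partial g_{\phi\phi}}{\partial\phi} + \frac{\partial g_{\phi\phi}}{\partial\phi} - \frac{\partial g_{\phi\phi}}{\partial\phi} \right].
\end{aligned}
$$

But given the metric tensor components shown in Eq. 6.34, all the partial
derivatives except $\partial g_{\phi\phi}/\partial\theta$ are zero, as are any terms involving
$g^{\theta\phi}$ or $g^{\phi\theta}$. That leaves only three non-zero Christoffel symbols, which,
with $g^{\theta\theta} = 1/a^2$ and $g^{\phi\phi} = 1/(a^2 \sin^2(\theta))$, are

$$
\begin{aligned}
\Gamma^\phi_{\theta\phi} = \Gamma^\phi_{\phi\theta} &= \left( \frac{1}{2} \right) g^{\phi\phi} \frac{\partial g_{\phi\phi}}{\partial\theta}
= \left( \frac{1}{2} \right) \frac{2 a^2 \sin(\theta)\cos(\theta)}{a^2 \sin^2(\theta)} = \cot(\theta), \\
\Gamma^\theta_{\phi\phi} &= -\left( \frac{1}{2} \right) g^{\theta\theta} \frac{\partial g_{\phi\phi}}{\partial\theta}
= -\left( \frac{1}{2} \right) \frac{2 a^2 \sin(\theta)\cos(\theta)}{a^2} = -\sin(\theta)\cos(\theta).
\end{aligned}
$$

With the Christoffel symbols for the spherical surface in hand, the components
of the Riemann curvature tensor may be found using

$$
R^\sigma_{\alpha\beta\gamma} = \frac{\partial \Gamma^\sigma_{\alpha\gamma}}{\partial x^\beta}
- \frac{\partial \Gamma^\sigma_{\alpha\beta}}{\partial x^\gamma}
+ \Gamma^\tau_{\alpha\gamma} \Gamma^\sigma_{\tau\beta}
- \Gamma^\tau_{\alpha\beta} \Gamma^\sigma_{\tau\gamma}.
$$

As in most tensor equations, the full content of this tensor can only be
appreciated by writing out the components. Not only must you allow each of the
indices $\sigma$, $\alpha$, $\beta$, and $\gamma$ to represent both $\theta$ and $\phi$, you must also allow the
dummy index $\tau$ to represent both $\theta$ and $\phi$ and then sum those terms. Hence
in two-dimensional space, the last two terms of the Riemann tensor equation
(those involving the products of the Christoffel symbols) become four terms,
making a total of six terms for each set of indices. The first eight components
of the Riemann tensor can be found by setting $\sigma$ equal to $\theta$ and letting the other
indices represent both $\theta$ and $\phi$:

$$
R^\theta_{\alpha\beta\gamma} = \frac{\partial \Gamma^\theta_{\alpha\gamma}}{\partial x^\beta}
- \frac{\partial \Gamma^\theta_{\alpha\beta}}{\partial x^\gamma}
+ \Gamma^\theta_{\alpha\gamma} \Gamma^\theta_{\theta\beta}
+ \Gamma^\phi_{\alpha\gamma} \Gamma^\theta_{\phi\beta}
- \Gamma^\theta_{\alpha\beta} \Gamma^\theta_{\theta\gamma}
- \Gamma^\phi_{\alpha\beta} \Gamma^\theta_{\phi\gamma},
\qquad \alpha, \beta, \gamma \in \{\theta, \phi\}.
$$

Inserting the Christoffel symbols found above, you can see that the non-zero
components are

$$
\begin{aligned}
R^\theta_{\phi\theta\phi} &= \frac{\partial \Gamma^\theta_{\phi\phi}}{\partial\theta} - \Gamma^\phi_{\phi\theta} \Gamma^\theta_{\phi\phi}, \\
R^\theta_{\phi\phi\theta} &= -\frac{\partial \Gamma^\theta_{\phi\phi}}{\partial\theta} + \Gamma^\phi_{\phi\theta} \Gamma^\theta_{\phi\phi}.
\end{aligned}
$$

Since $\partial \Gamma^\theta_{\phi\phi}/\partial\theta = \sin^2(\theta) - \cos^2(\theta)$ and
$\Gamma^\phi_{\phi\theta} \Gamma^\theta_{\phi\phi} = -\cos^2(\theta)$,
this means the surviving terms from the $\sigma = \theta$ group are

$$
R^\theta_{\phi\theta\phi} = \sin^2(\theta), \qquad R^\theta_{\phi\phi\theta} = -\sin^2(\theta).
$$

Now allowing $\sigma$ to equal $\phi$, the other eight terms are

$$
R^\phi_{\alpha\beta\gamma} = \frac{\partial \Gamma^\phi_{\alpha\gamma}}{\partial x^\beta}
- \frac{\partial \Gamma^\phi_{\alpha\beta}}{\partial x^\gamma}
+ \Gamma^\theta_{\alpha\gamma} \Gamma^\phi_{\theta\beta}
+ \Gamma^\phi_{\alpha\gamma} \Gamma^\phi_{\phi\beta}
- \Gamma^\theta_{\alpha\beta} \Gamma^\phi_{\theta\gamma}
- \Gamma^\phi_{\alpha\beta} \Gamma^\phi_{\phi\gamma},
\qquad \alpha, \beta, \gamma \in \{\theta, \phi\}.
$$

Again inserting the Christoffel symbols, the non-zero terms are found to be

$$
\begin{aligned}
R^\phi_{\theta\theta\phi} &= \frac{\partial \Gamma^\phi_{\theta\phi}}{\partial\theta} + \Gamma^\phi_{\theta\phi} \Gamma^\phi_{\phi\theta}, \\
R^\phi_{\theta\phi\theta} &= -\frac{\partial \Gamma^\phi_{\theta\phi}}{\partial\theta} - \Gamma^\phi_{\theta\phi} \Gamma^\phi_{\phi\theta}.
\end{aligned}
$$

And since

$$
\frac{\partial \Gamma^\phi_{\theta\phi}}{\partial\theta} = \frac{\partial}{\partial\theta} \left[ \frac{\cos(\theta)}{\sin(\theta)} \right]
= -\frac{\sin(\theta)}{\sin(\theta)} - \frac{\cos^2(\theta)}{\sin^2(\theta)} = -[1 + \cot^2(\theta)]
$$

and

$$
\Gamma^\phi_{\theta\phi} \Gamma^\phi_{\phi\theta} = \cot^2(\theta),
$$

the surviving terms are

$$
\begin{aligned}
R^\phi_{\theta\theta\phi} &= -[1 + \cot^2(\theta)] + \cot^2(\theta) = -1, \\
R^\phi_{\theta\phi\theta} &= [1 + \cot^2(\theta)] - \cot^2(\theta) = 1.
\end{aligned}
$$

As expected, a two-dimensional space with the metric of a sphere
($ds^2 = a^2 d\theta^2 + a^2 \sin^2(\theta)\, d\phi^2$) has non-zero components of the Riemann curvature
tensor, confirming that this space is non-Euclidean.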
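
The component bookkeeping can be cross-checked numerically. The Python sketch below is illustrative only and makes several assumptions: it hard-codes the three non-zero Christoffel symbols of the sphere, orders the coordinates as (theta, phi), approximates the derivatives in Eq. 6.28 with central finite differences, and uses hypothetical function names.

```python
import math

THETA, PHI = 0, 1  # coordinate indices


def christoffel(x):
    """gamma[s][a][b] = Gamma^s_ab for the sphere (three non-zero symbols)."""
    theta = x[THETA]
    gamma = [[[0.0, 0.0], [0.0, 0.0]] for _ in range(2)]
    gamma[THETA][PHI][PHI] = -math.sin(theta) * math.cos(theta)
    gamma[PHI][THETA][PHI] = math.cos(theta) / math.sin(theta)
    gamma[PHI][PHI][THETA] = gamma[PHI][THETA][PHI]
    return gamma


def christoffel_derivative(x, b, h=1e-5):
    """Central-difference estimate of d Gamma^s_ag / d x^b, indexed [s][a][g]."""
    x_plus, x_minus = list(x), list(x)
    x_plus[b] += h
    x_minus[b] -= h
    g_plus, g_minus = christoffel(x_plus), christoffel(x_minus)
    return [[[(g_plus[s][a][g] - g_minus[s][a][g]) / (2.0 * h) for g in range(2)]
             for a in range(2)]
            for s in range(2)]


def riemann(x):
    """Return R[s][a][b][g], an estimate of R^s_abg built from Eq. 6.28."""
    gam = christoffel(x)
    dgam = [christoffel_derivative(x, b) for b in range(2)]  # dgam[b][s][a][g]
    R = [[[[0.0, 0.0] for _ in range(2)] for _ in range(2)] for _ in range(2)]
    for s in range(2):
        for a in range(2):
            for b in range(2):
                for g in range(2):
                    products = sum(gam[t][a][g] * gam[s][t][b]
                                   - gam[t][a][b] * gam[s][t][g]
                                   for t in range(2))
                    R[s][a][b][g] = dgam[b][s][a][g] - dgam[g][s][a][b] + products
    return R


if __name__ == "__main__":
    theta = 0.7
    R = riemann((theta, 1.2))
    print("R^phi_theta,theta,phi:", R[PHI][THETA][THETA][PHI])
    print("R^phi_theta,phi,theta:", R[PHI][THETA][PHI][THETA])
    print("R^theta_phi,theta,phi:", R[THETA][PHI][THETA][PHI], "vs", math.sin(theta) ** 2)
```

Of the sixteen components, only four come out non-zero (to within finite-difference error): the two from the $\sigma = \phi$ group are close to -1 and +1, and the two from the $\sigma = \theta$ group are close to $\pm\sin^2(\theta)$.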

You can see how to use these results to find the Ricci tensor and the Ricci 
scalar in the on-line solutions to the problems at the end of this chapter. 


6.4 Chapter 6 problems 

6.1 Find the inertia tensor for a cubical arrangement of eight identical masses
with the origin of coordinates at one of the corners and the coordinate
axes along the edges of the cube.

6.2 How would the moment of inertia tensor of Problem 6.1 change if one of
the eight masses is removed?

6.3 Find the moment of inertia tensor for the arrangement of masses of
Problem 6.2 if the coordinate system is rotated by 20 degrees about one of
the coordinate axes (do this by finding the locations of the masses in the 
rotated coordinate system). 

6.4 Use the similarity-transform approach to verify the moment of inertia 
tensor you found in Problem 6.3. 

6.5 Show how the vector wave equation results from taking the curl of both 
sides of Faraday’s Law and inserting the curl of the magnetic field from 
the Ampere-Maxwell Law. 

6.6 If an observer in one coordinate system measures an electric field of
5 volts per meter in the z-direction and zero magnetic field, what electric
and magnetic fields would be measured by a second observer moving at
1/4 the speed of light along the x-axis?

6.7 If an observer in one coordinate system measures a magnetic field of 1.5
tesla in the z-direction and zero electric field, what electric and magnetic
fields would be measured by a second observer moving at 1/4 the speed
of light along the x-axis?

6.8 Show that $\vec{E} \circ \vec{B}$ is invariant under Lorentz transformation.

6.9 The differential line element in 2-D Euclidean space may be expressed
in polar coordinates as $ds^2 = dr^2 + r^2 d\theta^2$. Show that the Riemann
curvature tensor equals zero in this case, as it must for any flat space.

6.10 Find the Ricci tensor and scalar for the 2-sphere of Section 6.3.